Shame on you, R... again! (But not really...)

Monday, January 17, 2011 at 2:25 PM
Remember how a few months ago I lamented the fact that the round() function in R uses a non-standard rule for rounding to the nearest integer?  Instead of rounding k+0.5 to k+1 (k being an integer), R rounds to whichever of the integers k or k+1 is even.  Well, here's another example of R offending our mathematical sensibilities... R seems to think that even though
1 * Inf = Inf
somehow it can get away with telling us that
1 * (Inf + 0i) = Inf + NaNi?


"Gasp!" I know, insane, right? What's going on here? Whatever happened to "anything times one is equal to that same number"? Granted, infinity isn't really a number so sometimes we can't assign a value to an expression like Inf*0, but deep down inside I can't shake the feeling that 1 * Inf really should be Inf!

It turns out that R and I are both right - we're just making different assumptions about how we interpret all these 1s, 0s and Infs in these two statements. Let me explain...

Despite using sound, puppy-approved logic in this case, R gives the offending result because of how it implements everyone's favorite section in Calculus class: computing limits.  To understand why, take a closer look at how the multiplication is happening in each case above.  The first case is hopefully straightforward.  In the second case, 1 is treated as a complex number rather than a real scalar, which gives
1*(Inf+0i) = (1+0i)*(Inf+0i)
           = (1*Inf - 0*0) + (1*0 + 0*Inf)i
           = Inf + (0*Inf+0)i
           = Inf + NaNi
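
The component-wise product above can be reproduced with ordinary floating-point arithmetic. What follows is a minimal sketch in Python rather than R, and it is not a description of R's internals: the helper name `multiply_rectangular` is made up for illustration, and it simply applies the textbook rule (a+bi)(c+di) = (ac-bd) + (ad+bc)i to IEEE floats.

```python
def multiply_rectangular(a, b, c, d):
    """Multiply (a + bi) by (c + di) with the textbook formula.

    Hypothetical helper for illustration; returns (real, imag) as floats.
    """
    real = a * c - b * d   # for our inputs: 1*Inf - 0*0 = Inf
    imag = a * d + b * c   # for our inputs: 1*0 + 0*Inf = 0 + NaN = NaN
    return real, imag

inf = float("inf")
real, imag = multiply_rectangular(1.0, 0.0, inf, 0.0)
print(real, imag)  # inf nan
```

The NaN comes entirely from the 0*Inf term in the imaginary part; the real part never sees it.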

We could also throw in a third case and multiply these two complex numbers in the more natural context of polar coordinates. Writing each in terms of their modulus r (distance from the origin) and argument θ (angle off of the positive real axis) instead of in terms of their real and imaginary parts, we have
1*(Inf+0i) = (1+0i)*(Inf+0i)
           = 1 exp(i0) * Inf exp(i0)
           = (1*Inf) exp(i0)
           = Inf exp(i0)
           = Inf + 0i
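
The polar-form calculation can be sketched the same way. This is an illustrative Python snippet, not anything R does internally; `multiply_polar` is a hypothetical helper that represents each number as a (modulus, argument) pair, multiplies the moduli and adds the arguments.

```python
def multiply_polar(r1, theta1, r2, theta2):
    """Multiply two complex numbers given as (modulus, argument) pairs.

    Hypothetical helper: moduli multiply, arguments add.
    """
    return r1 * r2, theta1 + theta2

inf = float("inf")
modulus, argument = multiply_polar(1.0, 0.0, inf, 0.0)
print(modulus, argument)  # inf 0.0
```

No 0*Inf product ever appears here, so no NaN does either. The sketch deliberately stops at polar form: converting back numerically via r*sin(θ) would compute Inf*0 all over again.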
Whew! So what's "wrong" with multiplying things in x+yi form??

R recognizes that any computations involving infinity really require the algebra of limits, and acts appropriately (albeit conservatively) to evaluate such expressions. This discord then comes from what R assumes is the result of taking some limit and what is to be treated as a constant. Unless you've taught a calculus class recently some explanation might be in order.

In general, expressions involving infinity are treated as limits where some unspecified variable is going to infinity.  For example, an expression like Inf*0 can't be assigned a value because, in its most general interpretation, we're asking "What is the limit of the product x*y as x→Inf and y→0?"  Here, how y goes to 0, whether from above (e.g. y=1/x), from below (y=-1/x), or neither, and at what rate, will determine whether the product goes to zero, to some non-zero number, or to plus or minus infinity, or has no limit at all. (Open any calculus text to the sections on limits for examples leading to these different outcomes.)  Note that this example does have an answer if y is always assumed to be 0, since it's always the case that x*0=0.
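
To make the point concrete, here is a small Python sketch that evaluates x*y along a few different ways of sending y to 0. Only y=1/x, y=-1/x and y=0 come from the paragraph above; the other two paths, and all the names, are illustrative choices of mine.

```python
import math

# Different ways for y to approach 0 while x grows without bound.
# The labels and sample paths are illustrative, not anything built into R.
paths = {
    "y = 1/x":       lambda x: 1.0 / x,
    "y = -1/x":      lambda x: -1.0 / x,
    "y = 1/x^2":     lambda x: 1.0 / (x * x),
    "y = 1/sqrt(x)": lambda x: 1.0 / math.sqrt(x),
    "y = 0":         lambda x: 0.0,
}

def products(x):
    """Return x*y for each sample path, evaluated at the given x."""
    return {name: x * y(x) for name, y in paths.items()}

# Even powers of two keep every step exact in floating point.
for exponent in (10, 20, 40):
    x = 2.0 ** exponent
    print(x, products(x))
```

As x grows, the products stay at 1 and -1 for the first two paths, shrink toward 0 for the third, grow without bound for the fourth, and are exactly 0 for the constant path. One expression, "Inf*0", five different behaviours.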

That means that, depending on how we interpret the zero, our example might equal either
Inf*0=NaN
or
Inf*0=0
This is exactly what's going on above.

Returning to the two statements at the top of this post, we can now understand why R gives these two different answers.  Depending on whether the zero is implicit or explicit, R treats these expressions differently. In the first case, R interprets Inf as "the limit of x + 0i as x→Inf," allowing for the result that
1 * Inf = Inf
whereas in the second case R treats Inf + 0i as "the limit of x + yi as x → Inf and y → 0," which has no general answer and therefore gets assigned a value of NaN.
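
One way to picture the difference between "constant" and "limit" is to compare two multiplication rules side by side. The Python sketch below is only an illustration under my own assumptions, not a description of how R is implemented, and both function names are hypothetical: one promotes the real factor to s + 0i before multiplying, the other treats it as a true constant and scales each part separately.

```python
def multiply_as_complex(s, x, y):
    """Promote s to s + 0i, then multiply by x + yi with the textbook rule."""
    return s * x - 0.0 * y, s * y + 0.0 * x

def scale_by_real(s, x, y):
    """Treat s as a constant real scalar: scale each part of x + yi."""
    return s * x, s * y

inf = float("inf")
promoted = multiply_as_complex(1.0, inf, 0.0)
scaled = scale_by_real(1.0, inf, 0.0)
print(promoted)  # (inf, nan)
print(scaled)    # (inf, 0.0)
```

The promoted version introduces an extra 0*Inf term, which is where the NaN comes from; the scaled version never forms that product.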

The take-home message: as soon as there's an Inf in an expression, R proceeds assuming everything is a limit, even though it might be clear to the user that some of those key 1s and 0s should be treated as constants.

Data Visualization in R: Part... 0

Friday, January 14, 2011 at 3:26 PM
I haven't forgotten that I promised to do a series of posts on data visualization using R - just a bit busy catching up after some excellent holiday R&R. Hopefully I'll get a post out soon!

In the meantime, check out these two posts from the R-bloggers network.