R pitfalls #4: redefining the basics

I try to be economical when writing code; for example, I tend to use single quotes over double quotes for character strings because it saves me one keystroke. One area where I don’t economize is typing TRUE and FALSE in full (R accepts T and F as well), because the full words are clearer in code and syntax highlighting kicks in. That’s why I was surprised to read in Jason Morgan’s post that it is possible to redefine T and F and get undesirable behavior.

Playing around, I found it is quite easy to redefine other fundamental constants in R. For example, I posted on Twitter:

> pi
[1] 3.141593
> pi <- 2
> pi*2
[1] 4

Ouch, dangerous! I tend to muck around with matrices quite a bit and, being a friend of parsimony, I often use capital letters to represent them. This would have eventually bitten me if I had used the abbreviated TRUE and FALSE. As Kevin Ushey replied to my tweet, one can redefine even basic functions like ‘+’ and be pure evil; over the top, sure, but possible.
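To make the hazard concrete, here is a minimal sketch. The variable names (`x`, `masked_mean` and so on) and the idea of using `T` as a start time are made up for illustration; they are not taken from Jason Morgan's post or from the tweets.

```r
x <- c(3, NA, 5)

# T and F are ordinary variables defined in base R, so a script can mask them.
T <- 0                                # e.g. a hypothetical "start time"
masked_mean <- mean(x, na.rm = T)     # na.rm is now 0, i.e. FALSE: result is NA
safe_mean   <- mean(x, na.rm = TRUE)  # 4: TRUE is a reserved word

# Reserved words cannot be reassigned; the attempt is an error.
attempt <- tryCatch(eval(parse(text = "TRUE <- 0")),
                    error = function(e) "error")

# Even an operator can be masked, since `+` is just a function.
`+` <- function(e1, e2) e1 - e2
evil_sum <- 2 + 2                     # 0 while the mask is in place

# Remove the masks so the base definitions are visible again.
rm(list = c("T", "+"))
```

The nasty part is that the masked call raises no error: it quietly returns NA instead of 4.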


9 responses to “R pitfalls #4: redefining the basics”

  1. I was working on some Tukey’s HSD results and innocently decided to call my vector letters, then wondered why my vector was 26 characters long rather than the 6 it should have been. Guess where R stores its alphabet! Easily done!

    • R is a very large language with lots of predefined (although redefinable) names. It pays to pay attention all the time.

      • It certainly does. I always check to make sure that I get what I wanted (which is how I discovered the letters reservation).
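A small sketch of how a user-defined `letters` coexists with the built-in one. The grouping labels are hypothetical, and this does not try to reproduce the exact situation described in the comment above.

```r
# A user-level `letters` masks the built-in constant without replacing it.
letters <- c("a", "a", "ab", "b", "b", "c")  # hypothetical grouping labels
n_masked <- length(letters)        # 6: the user-defined copy is found first
n_base   <- length(base::letters)  # 26: the base constant is untouched

# Removing the user-defined copy makes the built-in visible again.
rm(letters)
n_after <- length(letters)         # 26
```

Writing `base::letters` always reaches the built-in alphabet, whatever has been defined in the workspace.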

  2. Yeah, well, this is true in pretty much every language out there. Heck, I knew folks who thought it was funny to add this line to their coworkers’ .login file: “alias ls=logout”.

  3. I doubt that someone would change “pi”, but T and F might change by accidental reassignment.

    Running some error checking on at least the most common possibilities might be of use. I guess, though, that these statements would have to come at the end of your script to make sure nothing was overwritten. Something along the lines of:

    if (!identical(T, TRUE)) stop("'T' has been reassigned to ", T)

    • I doubt that someone would do it on purpose, but I can think of a number of acronyms related to my area of work for which it would make sense to use pi as a variable name.

    • I’ve had some nasty bugs due to redefining T to a transition matrix in my early days of R programming; I was pretty annoyed when I found out the language would let me do something so dangerous.
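The single-name check suggested above can be generalized. The following is only a sketch: `masked_base_names` is a hypothetical helper, not part of base R, and the `demo` environment stands in for a workspace so the example does not touch real variables.

```r
# Hypothetical helper: names defined in `env` that also exist in base R.
masked_base_names <- function(env = globalenv()) {
  intersect(ls(env), ls(baseenv()))
}

# Stand-in for a workspace where T and pi have been reused.
demo <- new.env()
assign("T", diag(2), envir = demo)     # e.g. a transition matrix
assign("pi", 2, envir = demo)          # e.g. a domain-specific acronym
assign("my_data", 1:3, envir = demo)   # harmless: no base object has this name

clashes <- masked_base_names(demo)     # contains "T" and "pi", not "my_data"
```

Calling `masked_base_names()` with no arguments at the end of a script would list workspace names that shadow base objects. Base R's `conflicts()` reports similar information across the whole search path.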

  4. One redefinition in R is very useful for cross-platform work, e.g. when you are developing a script on a Mac but someone else will also be using it on Windows. It lets every quartz() call that opens a new graph window keep working on Windows (and could easily be swapped to go the other way):

    # when collaborating, need to swap windows() vs quartz() calls. This does that nicely:
    # (courtesy of https://stat.ethz.ch/pipermail/r-help/2008-December/181899.html)

    if (.Platform$OS.type == "windows") {
    quartz<-function() windows()