Category: r

  • A brief idea of style

    Once one starts writing more R code, the need for consistency increases, because it makes larger projects easier to manage and maintain. There are several style guides or suggestions for R; for example, Andrew Gelman’s, Hadley Wickham’s, Bioconductor’s and this one. I tend to write closer to Google’s R style guide, which contains some helpful suggestions.…

  • All combinations for levelplot

    In a previous post I explained how to create all possible combinations of the levels of two factors using expand.grid(). Another use for this function is to build a regular grid over two variables for a levelplot or a contour plot. For example, let’s say that we have fitted a multiple linear regression to…
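A minimal sketch of the idea, under stated assumptions: the excerpt is cut off before it names the data or the model, so the data frame `sim`, the predictors `x1` and `x2`, and the coefficients used to simulate `y` are all hypothetical. The sketch fits a two-predictor regression, builds a regular grid with expand.grid(), predicts over that grid, and passes the result to levelplot() from the lattice package.

```r
library(lattice)  # lattice ships with standard R distributions

# Simulated data, purely for illustration (not from the original post)
set.seed(42)
n <- 100
sim <- data.frame(x1 = runif(n, 0, 10), x2 = runif(n, 0, 5))
sim$y <- 2 + 0.5 * sim$x1 - 1.2 * sim$x2 + rnorm(n, sd = 0.5)

fit <- lm(y ~ x1 + x2, data = sim)

# Regular 50 x 50 grid covering the observed range of both predictors
grid <- expand.grid(
  x1 = seq(min(sim$x1), max(sim$x1), length.out = 50),
  x2 = seq(min(sim$x2), max(sim$x2), length.out = 50)
)

# Column names in the grid must match the predictor names in the model
grid$pred <- predict(fit, newdata = grid)

# levelplot() returns a trellis object; print() draws it
p <- levelplot(pred ~ x1 * x2, data = grid, xlab = "x1", ylab = "x2")
print(p)
```

Swapping levelplot() for contourplot() with the same formula should give the contour version of the same surface.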

  • On R versus SAS

    A short while ago there was a discussion on LinkedIn about the use of SAS versus R for the enterprise. I have thought a bit about the issue but, as I do not use LinkedIn, I did not make any comments there. Disclaimer: I did use SAS a lot between 1992 and 1997, mostly for…

  • Linear regression with correlated data

    I started following the debate on the differential minimum wage for youth (15-19 years old) and adults in New Zealand. Eric Crampton has written a nice series of blog posts, making the data from Statistics New Zealand available. I will use the nzunemployment.csv data file (with quarterly data from March 1986 to June 2011) and show…

  • R pitfall #1: check data structure

    A common problem when running a simple (or not so simple) analysis is forgetting that the levels of a factor have been coded using integers. R doesn’t know that this variable is supposed to be a factor and when fitting, for example, something as simple as a one-way ANOVA (using lm()) the variable will be…
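The pitfall can be illustrated with a small sketch. The data frame `trial`, its columns `treat` and `resp`, and the group means are hypothetical and simulated only for illustration. With integer codes, lm() fits a single slope for `treat`, using one degree of freedom; after conversion with factor(), the same call fits a one-way ANOVA with one parameter per level beyond the first.

```r
# Hypothetical data: three treatment groups coded as integers 1, 2, 3
set.seed(1)
trial <- data.frame(
  treat = rep(1:3, each = 10),
  resp  = rnorm(30, mean = rep(c(10, 14, 11), each = 10))
)

# Check the structure first: treat is reported as int, not Factor
str(trial)

# Integer codes: treat enters as a numeric covariate (1 df)
fit_num <- lm(resp ~ treat, data = trial)

# Explicit factor: treat enters as a 3-level classification (2 df)
trial$treat_f <- factor(trial$treat)
fit_fac <- lm(resp ~ treat_f, data = trial)

# Compare the degrees of freedom used by the treatment term
anova(fit_num)
anova(fit_fac)
```

Running str() right after reading the data is a cheap way to catch the problem before any model is fitted.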