R, academia and the democratization of statistics

I am not a statistician but I use statistics, teach statistics and write about applications of statistics in biological problems.

Last week I was in this biostatistics conference, talking with a Ph.D. student who was surprised about this situation because I didn’t have any statistical training. I corrected “any formal training”. On the first day one of the invited speakers was musing about the growing number of “amateurs” using statistics—many times wrongly—and about what biostatisticians could offer as professional value-adding. Yes, he was talking about people like me spoiling the party.

Twenty years ago it was more difficult to say “cool, I have some data and I will run some statistical analyses” because (and you can easily see where I am going here) access to statistical software was difficult. You were among the lucky ones if you were based at a university or a large company, because you had access to SAS, SPSS, MINITAB, etc. However, you were out of luck outside of these environments, because there was no way to easily afford a personal licence, not for a hobby at least. This greatly limited the pool of people that could afford to muck around with stats.

Gratuitous picture: spiral in Sydney. — GratuitousSydney

Enter R and other free (sensu gratis) software that allowed us to skip the specialist, skip the university or the large organization. Do you need a formal degree in statistics to start running analyses? Do you even need to go through a university (for any degree, it doesn’t really matter) to do so? There are plenty of resources to start, download R or a matrix language, get online tutorials and books, read, read, read and ask questions in email lists or fora when you get stuck. If you are a clever cookie—and some of you clearly are one—you could easily cover as much ground as someone going through a university degree. It is probably still not enough to settle down, but it is a great start and a great improvement over the situation twenty years ago.

This description leaves three groups in trouble, trying to sort out their “value-adding” ability: academia, (bio)statisticians and software makers. What are universities offering, that’s unique enough, to justify the time and money invested by individuals and governments? If you make software, What makes it special? For how long can you rely on tradition and inertia so people don’t switch to something else? What’s so special about your (bio) statistical training to justify having one “you” in the organization? Too many questions, so better I go to sleep.

P.S. Did I write “value-adding” ability? I must have been talking to the suits for too long… Next post I may end up writing “value-adding proposition”!