The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
John W Tukey in Sunset Salvo. 1986. The American Statistician 40(1): 72-76.
Before I lose the link (I’m deleting toots & tweets two weeks after I post them), I should save the address for “Introduction to Modern Causal Inference” by Alejandro Schuler and Mark van der Laan. It is a book draft that looks quite readable.
Also love Xanthe Tynehorne, Esq.’s Compendium of Curious Words. Weird enough to make it interesting.
Count me fascinated by the Literature Clock by Johs Enevoldsen, which shows a passage from a novel, poem, etc. that mentions the current time on your computer clock.
I have kept on adding links until 5th February:
This Bayesian Data Analysis course, by Aki Vehtari, based on the classic BDA3 book (link to the free online version) looks really interesting. Even more so if you have already done some Bayesian stats work or study.
Against Copyediting: Is It Time to Abolish the Department of Corrections? by Helen Rubinstein got me thinking about how we “correct” while editing texts, in my case mostly writings by postgrad students.
In our research group we often have people creating statistical models that end up in publications but, most of the time, the practical implementation of those models is lacking. I mean, we have a bunch of barely functioning code that is very difficult to use reliably in the operations of the breeding programs. I was very keen to continue using one of the models in our research, enough to rewrite and document the model fitting, and then create another package for using the model in operations.
Unfortunately, neither the data nor the model are mine to give away, so I can’t share them (yet). But I hope these notes will help you if you are in the same boat and need to use your models (or if ‘you’ are in fact future me, who tends to forget how or why I wrote code in a specific way).
There is a lot of talk about the skills needed for working in Statistics/Data Science, with the discussion often focusing on theoretical understanding, programming languages, exploratory data analysis, and visualization. There are many good blog posts dealing with how you get data, process it with your favorite language and then create some good-looking plots. However, in my opinion, one important skill is curiosity; more specifically, being data curious.
Often, being data curious doesn’t require statistics or coding, but just searching for and looking at graphs. A quick example comes from Mike Dickinson’s tweet: “This is extraordinary: within a decade, NZers basically stopped eating lamb. 160 years of tradition scrapped almost overnight.”
I have continued playing with the tidyverse for different parts of a couple of projects.
Often I need to apply a function by groups of observations; sometimes, that function returns more than a single number. It could be something like fitting a distribution to each group and returning the distribution parameters. Or, simpler for the purposes of this exploration, calculating and returning a bunch of numbers.
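To make the simpler case concrete, here is a minimal sketch of the split-apply-combine idea in plain Python (standard library only), not the tidyverse code this post is about. Everything in it is an assumption made for illustration: the `trees` data are invented, and `summarise` and `apply_by_group` are hypothetical helpers, not functions from any package.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise(values):
    """Hypothetical per-group function that returns several numbers at once."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation, needs at least 2 values
        "min": min(values),
        "max": max(values),
    }


def apply_by_group(rows, key, value, func):
    """Split rows by `key`, apply `func` to each group's `value`s, combine the results."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    # One output row per group: the group label plus every number func returned.
    return [{key: label, **func(vals)} for label, vals in groups.items()]


# Invented example data: tree heights measured at two sites.
trees = [
    {"site": "A", "height": 10.0},
    {"site": "A", "height": 12.0},
    {"site": "A", "height": 14.0},
    {"site": "B", "height": 20.0},
    {"site": "B", "height": 22.0},
]

result = apply_by_group(trees, "site", "height", summarise)
for row in result:
    print(row)
```

The point of the design is that the per-group function returns a named collection rather than a single value, so each group still ends up as one row with several columns. Swapping `summarise` for a function that fits a distribution and returns its parameters would follow the same pattern.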