
Reading a folder with many small files
One of the tools we use in our research is NIR (Near-Infrared Spectroscopy), which we apply to thousands of samples to predict their chemical composition. Each NIR spectrum is contained in a CSV text file with two numerical columns: wavelength and reflectance. All files have the same number of rows (1296 in our case), which…

Calculating parliament seats allocation and quotients
I was having a conversation about dropping the minimum threshold (currently 5% of the vote) for political parties to get representation in Parliament. The obvious question is how seat allocation would change, which of course involves a calculation. There is a calculator on the Electoral Commission website, but trying to understand how things work (and…

Collecting results of the New Zealand General Elections
I was reading an article about the results of our latest elections and having a look at the spatial pattern of votes in my city. I was wondering how I would go about obtaining the data for something like that and went to the Electoral Commission, which has this neat page with links…

Functions with multiple results in tidyverse
I have continued playing with the tidyverse for different parts of a couple of projects. Often I need to apply a function by groups of observations; sometimes, that function returns more than a single number. It could be something like fitting a distribution for each group and returning the distribution parameters. Or, simpler for the…

Turtles all the way down
One of the main uses for R is for exploration and learning. Let’s say that I wanted to learn simple linear regression (the bread and butter of statistics) and see how the formulas work. I could simulate a simple example and fit the regression with R:
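A minimal sketch of what such a simulation might look like, not necessarily the code from the original post. The sample size, true intercept and slope, noise level, and all variable names (`n`, `x`, `y`, `fit`, `b0`, `b1`) are assumptions chosen for illustration. It fits the model with `lm()` and then recomputes the estimates from the usual least-squares formulas to show that they agree.

```r
# Hypothetical simulated data: every value here is an assumption
set.seed(42)
n <- 100
x <- runif(n, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(n, mean = 0, sd = 1)  # true intercept 2, slope 0.5

# Fit the simple linear regression
fit <- lm(y ~ x)
coef(fit)

# The same estimates from the textbook formulas:
# slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
c(b0, b1)
```

The two sets of coefficients should match up to floating-point error, which is the point of the exercise: `lm()` is doing nothing more mysterious than the formulas.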