Posts tagged stats
When Nelder is THAT Nelder
There is a relatively common planting spacing experimental design called a Nelder trial. It is really cool looking from the… -
The Ghost of p-values Past
— …but it doesn’t make any sense! — Please hear me out. — You have one minute, nothing more. —… -
Type safety in R
༄> Introduction to r-lib type safety checks. -
Why do anova(type=’marginal’) and anova(type=’III’) yield different results on lmer() models?
༄> A good answer in StackExchange. -
Time for correlations
A few posts ago I was talking about heritabilities (like here) and it’s time to say something about genetic correlations.… -
Start with the programming language and statistical approach used by your community
I have been very busy with the start of the semester, teaching regression modelling. The craziest thing was that the… -
Back of the envelope calculations: pulp mill
Imagine that someone stops you on the street and asks “How many hectares of plantations do we need for a… -
Exposing rather than hiding complexity
In the mid-1990s I was at Massey University in Palmerston North, centre of the known universe, where I was doing… -
Potential material for teaching
༄> Statistics for ecologists -
Superindex and subindex in ggpairs axes labels
༄> I was having problems with the syntax to get the axis labels with subindices and superindices, as it didn’t work… -
greenR: Green spaces in R
༄> Yesterday I was attending the Urban Forest Futures conference and there were several interesting presentations. Here is a couple of… -
duckplyr: dplyr + DuckDB
༄> DuckDB released a new R package – duckplyr, which enables running dplyr functions using the DuckDB engine on the backend… -
I haven’t done an animation in R in ages
༄> So I needed to remember how to do it. This post “Building an animation step-by-step with gganimate” is pretty helpful:… -
What Would Akaike Do?
This AIC looks way more fun than the other AIC for (soft toy) model selection. -
Some love for Base R. Part 4
Following on parts 1, 2 & 3 (yes, a series) we arrive at part 4 revisiting Base R. See part 1 for… -
Anyone using other than RStudio?
I asked both in Mastodon and Twitter “Anyone using other than #RStudio as their main #rstats IDE?” and—knowing that some… -
Infrequent doesn’t disprove
皿 There is no logical warrant for considering an event known to occur in a given hypothesis, even if infrequently, as… -
Sense-checking data
Over the birdsite dumpster fire. Emily Harvey was asking (I had to remove the link because, sensibly, she closed her account): -
Some love for Base R. Part 3
It seems a few people have found useful the reminders of base-R functionality covered in “Some love for Base R”… -
Some love for Base R. Part 2
Where were we? Giving some love to base-R and putting together the idea that it is possible to write R… -
Some love for Base R. Part 1
For a long time it has bothered me when people look down at base-R (meaning the set of functions that… -
Not a contribution to science
皿 Null hypotheses of no difference are usually known to be false before the data are collected … when they are,… -
The data may not contain the answer
皿 The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can… -
Flotsam 15: inference
༄> Before I lose the link, as I’m deleting toots & tweets two weeks after I post them, I should save the address… -
Creating an n x n autocorrelation matrix
Between covid-19 news and announcements of imminent Russia-Ukraine wars I needed a bit of a distraction. Sooo, here it is… -
The beauty of code vectorisation
I came across this problem on Twitter: -
Fixing Rcpp warning in Mac OS
In Mac OS I was getting an annoying warning when compiling Cpp code via Rcpp in R: -
Implementing a model as an R package
In our research group we often have people creating statistical models that end up in publications but, most of the… -
Reading a folder with many small files
One of the tools we use in our research is NIR (Near-Infrared Spectroscopy), which we apply to thousands of samples… -
Being data curious: the strange case of lamb consumption in NZ
There is a lot of talk about the skills needed for working in Statistics/Data Science, with the discussion often focusing… -
Collecting results of the New Zealand General Elections
I was reading an article about the results of our latest elections where I was having a look at the… -
Functions with multiple results in tidyverse
I have continued playing with the tidyverse for different parts of a couple of projects. -
Turtles all the way down
One of the main uses for R is for exploration and learning. Let’s say that I wanted to learn simple… -
Old dog and the tidyverse
I started using R ages ago and have happily lived in mostly-base-R for data manipulation. Once in a while I… -
Cute Gibbs sampling for rounded observations
I was attending a course of Bayesian Statistics where this problem showed up: -
Mucking around with maps, schools and ethnicity in NZ
I’ve been having a conversation for a while with Harkanwal Singh and Aaron Schiff on maps, schools, census, making NZ… -
Back of the envelope look at school decile changes
Currently there is some discussion in New Zealand about the effect of the reclassification of schools in socioeconomic deciles. An… -
Comment on Sustainability and innovation in staple crop production in the US Midwest
After writing a blog post about the paper “Sustainability and innovation in staple crop production in the US Midwest” I… -
Sometimes I feel (some) need for speed
I’m the first to acknowledge that most of my code could run faster. The truth of the matter is that,… -
Less wordy R
The Swarm Lab presents a nice comparison of R and Python code for a simple (read ‘one could do it… -
R as a second language
Imagine that you are studying English as a second language; you learn the basic rules, some vocabulary and start writing… -
Teaching linear models
I teach several courses every year and the most difficult to pull off is FORE224/STAT202: regression modeling. -
Statistics unplugged
How much does statistical software help and how much does it interfere when teaching statistical concepts? Software used in the practice… -
Using Processing and R together (in OS X)
I wanted to develop a small experiment with a front end using the Processing language and the backend calculations in… -
Excel, fanaticism and R
This week I’ve been feeling tired of excessive fanaticism (or zealotry) of open source software (OSS) and R in general.… -
Flotsam 13: early July links
༄> Man flu kept me at home today, so I decided to do something ‘useful’ and go for a linkathon: -
My take on the USA versus Western Europe comparison of GM corn
A few days ago I came across Jack Heinemann and collaborators’ article (Sustainability and innovation in staple crop production in… -
GM-fed pigs, chance and how research works
Following my post on GM-fed pigs I received several comments, mostly through Twitter. Some people liked having access to an… -
Ordinal logistic GM pigs
This week another ‘scary GMO cause disease’ story was doing the rounds on the internet: A long-term toxicology study on pigs… -
Analyzing a simple experiment with heterogeneous variances using asreml, MCMCglmm and SAS
I was working with a small experiment which includes families from two Eucalyptus species and thought it would be nice… -
Subsetting data
At School we use R across many courses, because students are supposed to use statistics under a variety of contexts.… -
An R wish list for 2013
First go and read An R wish list for 2012. None of the wishes came through in 2012. Fix the… -
My R year
End-of-year posts are corny but, what the heck, I think I can let myself delve into corniness once a… -
Matrix Algebra Useful for Statistics
I was having a conversation with an acquaintance about courses that were particularly useful in our work. My forestry degree… -
When R, or any other language, is not enough
This post is tangential to R, although R has a fair share of the issues I mention here, which include… -
Multisite, multivariate genetic analysis: simulation and analysis
The email wasn’t a challenge but a simple question: Is it possible to run a multivariate analysis in multiple sites?… -
More sense of random effects
I can’t exactly remember how I arrived at “Making sense of random effects” (original post is gone), in the Distributed… -
Overlay of design matrices in genetic analysis
I’ve ignored my quantitative geneticist side of things for a while (at least in this blog) so this time I’ll… -
A word of caution: the sample may have an effect
This week I’ve tried to i-stay mostly in the descriptive statistics realm and ii-surround any simple(istic) models with caveats and… -
Some regressions on school data
Eric and I have been exchanging emails about potential analyses for the school data and he published a first draft… -
Updating and expanding New Zealand school data
In two previous posts I put together a data set and presented some exploratory data analysis on school achievement for… -
New Zealand school performance: beyond the headlines
I like the idea of having data on school performance, not to directly rank schools—hard, to say the least, at… -
New Zealand School data
Some people have the naive idea that the national standards data and the decile data will not be put together.… -
(Unsurprisingly) users default to the defaults
Oddities tend to jump out when one uses software on a daily basis. The situation is even clearer when using… -
Suicide statistics and the Christchurch earthquake
Suicide is a tragic and complex problem. This week New Zealand’s Chief Coroner released its annual statistics on suicide, which… -
m x n matrix with randomly assigned 0/1
Today Scott Chamberlain tweeted asking for a better/faster solution to building an m x n matrix with randomly assigned 0/1.… -
Mid-August flotsam
༄> Reached mid-semester point, with quite a few new lectures to prepare. Nothing extremely complicated but, as always, the tricky part… -
INLA: Bayes goes to Norway
INLA is not the Norwegian answer to ABBA; that would probably be a-ha. INLA is the answer to ‘Why do… -
Careless comparison bites back (again)
When running stats labs I like to allocate a slightly different subset of data to each student, which acts as… -
Early August flotsam
༄> Back teaching a couple of subjects and it’s the constant challenge to find enough common ground with students so one… -
Split-plot 2: let’s throw in some spatial effects
Disappeared for a while collecting frequent flyer points. In the process I ‘discovered’ that I live in the middle of… -
Split-plot 1: How does a linear mixed model look like?
I like statistics and I struggle with statistics. Often times I get frustrated when I don’t understand and I really… -
Review: “Forest Analytics with R: an introduction”
Forestry is the province of variability. From a spatial point of view this variability ranges from within-tree variation (e.g. modeling… -
R’s increasing popularity. Should we care?
Some people will say ‘you have to learn R if you want to get a job doing statistics/data science’. I… -
Bivariate linear mixed models using ASReml-R with multiple cores
A while ago I wanted to run a quantitative genetic analysis where the performance of genotypes in each site was… -
Teaching code, production code, benchmarks and new languages
I’m a bit obsessive with words. Maybe I should have used learning in the title, rather than teaching code.… -
R, Julia and genome wide selection
— “You are a pussy” emailed my friend. — “Sensu cat?” I replied. — “No. Sensu chicken” blurbed my now ex-friend. -
If you have to use circles…
Stats Chat is an interesting kiwi site—managed by the Department of Statistics of the University of Auckland—that centers around the… -
Revisiting homicide rates
A pint of R plotted an interesting dataset: intentional homicides in South America. I thought the graphs were pretty but… -
Oracle’s strange understanding of R users
After reading David Smith’s on the price of Oracle R Enterprise (actually free, but it requires Oracle Data Mining at… -
Rstudio and asreml working together in a mac
December and January were crazy months, with a lot of travel and suddenly I found myself in February working in… -
Mid-January flotsam: teaching edition
༄> I was thinking about new material that I will use for teaching this coming semester (starting the third week of… -
R is a language
A commenter on this blog reminded me of one of the frustrating aspects faced by newbies, not only to R… -
Doing Bayesian Data Analysis now in JAGS
Around Christmas time I presented my first impressions of Kruschke’s Doing Bayesian Data Analysis. This is a very nice book… -
Plotting earthquake data
Since 4th September 2010 we have had over 2,800 quakes (considering only magnitude 3+) in Christchurch. Quakes come in… -
An R wish list for 2012
I expect there will be many reviews and wish lists for R this year, with many of them focusing on… -
First impressions of Doing Bayesian Data Analysis
About a month ago I was discussing the approach that I would like to see in introductory Bayesian statistics books.… -
R pitfall #3: friggin’ factors
I received an email from one of my students expressing deep frustration with a seemingly simple problem. He had a… -
Tall big data, wide big data
After attending two one-day workshops last week I spent most days paying attention to (well, at least listening to) presentations… -
R, academia and the democratization of statistics
I am not a statistician but I use statistics, teach statistics and write about applications of statistics in biological problems. -
On the (statistical) road, workshops and R
Things have been a bit quiet at Quantum Forest during the last ten days. Last Monday (Sunday for most readers)… -
If you are writing a book on Bayesian statistics
This post is somewhat marginal to R in that there are several statistical systems that could be used to tackle… -
No one would ever conceive
皿 I believe that no one who is familiar, either with mathematical advances in other fields, or with the range of… -
Do we need to deal with ‘big data’ in R?
David Smith at the Revolutions blog posted a nice presentation on “big data” (oh, how I dislike that term). It… -
Surviving a binomial mixed model
A few years ago we had this really cool idea: we had to establish a trial to understand wood quality… -
On “true” models
皿 Before starting the description of the probability distributions, we want to impose on the reader the essential feature that a… -
Coming out of the (Bayesian) closet: multivariate version
This week I’m facing my—and many other lecturers’—least favorite part of teaching: grading exams. In a supreme act of procrastination… -
Coming out of the (Bayesian) closet
Until today all the posts in this blog have used a frequentist view of the world. I have a confession… -
Teaching with R: the tools
I bought an Android phone, nothing fancy just my first foray into the smartphone world, which is a big change… -
Multivariate linear mixed models: livin’ la vida loca
I swear there was a point in writing an introduction to covariance structures: now we can start joining all sort… -
Covariance structures
In most mixed linear model packages (e.g. asreml, lme4, nlme, etc) one needs to specify only the model equation (the… -
Longitudinal analysis: autocorrelation makes a difference
Back to posting after a long weekend and more than enough rugby coverage to last a few years. Anyway, back… -
Teaching with R: the switch
There are several blog posts, websites (and even books) explaining the transition from using another statistical system (e.g. SAS, SPSS,… -
Spatial correlation in designed experiments
Last Wednesday I had a meeting with the folks of the New Zealand Drylands Forest Initiative in Blenheim. In addition… -
Large applications of linear mixed models
In a previous post I summarily described our options for (generalized to varying degrees) linear mixed models from a frequentist… -
Linear mixed models in R
A substantial part of my job has little to do with statistics; nevertheless, a large proportion of the statistical side… -
Maximum likelihood
This post is one of those ‘explain to myself how things work’ documents, which are not necessarily completely correct but… -
Simulating data following a given covariance structure
Every year there are at least a couple of occasions when I have to simulate multivariate data that follow a… -
Upgrading R (and packages)
I tend not to upgrade R very often—running from 6 months to 1 year behind in version numbers—because I had… -
On R versus SAS
A short while ago there was a discussion on linkedin about the use of SAS versus R for the enterprise.… -
Linear regression with correlated data
I started following the debate on differential minimum wage for youth (15-19 year old) and adults in New Zealand. Eric… -
R pitfall #1: check data structure
A common problem when running a simple (or not so simple) analysis is forgetting that the levels of a factor… -
A shoebox for data analysis
Recidivism. That’s my situation concerning this posting flotsam in/on/to the ether. I’ve tried before and, often, will change priorities after… -
Python code to simulate the Monty Hall problem
I always struggled to understand the Monty Hall Problem, which popped up again in William Briggs’s site. The program assumes…