Category: r

Some regressions on school data

2012-09-26 / Luis

Eric and I have been exchanging emails about potential analyses for the school data and he published a first draft model in Offsetting Behaviour. I have kept on doing mostly data exploration while we get a definitive full dataset, and looking at some of the pictures I thought we could present a model with fewer predictors.

The starting point is the standards dataset I created in the previous post:

Updating and expanding New Zealand school data

2012-09-25 / Luis

In two previous posts I put together a data set and presented some exploratory data analysis on school achievement for national standards. After those posts I exchanged emails with a few people about the sources of data and Jeremy Greenbrook-Held pointed out Education Counts as a good source of additional variables, including number of teachers per school and proportions for different ethnic groups.

The code below calls three files: Directory-Schools-Current.csv, teacher-numbers.csv and SchoolReport_data_distributable.csv, which you can download from the links.

New Zealand school performance: beyond the headlines

2012-09-24 / Luis

I like the idea of having data on school performance, not to directly rank schools—hard, to say the least, at this stage—but because we can start having a look at the factors influencing test results. I imagine the opportunity in the not so distant future to run hierarchical models combining Ministry of Education data with Census/Statistics New Zealand data.

At the same time, there is the temptation to come up with very simple analyses that would make appealing newspaper headlines. I’ll read the data and create a headline and then I’ll move to something that, personally, seems more important. In my previous post I combined the national standards for around 1,000 schools with decile information to create the standards.csv file.

New Zealand School data

2012-09-24 / Luis

Some people have the naive idea that the national standards data and the decile data will not be put together. Given that at some point all the data will be available, there is no technical reason not to merge them, which I have done in this post for an early release.

(Unsurprisingly) users default to the defaults

2012-09-19 / Luis

Oddities tend to jump out when one uses software on a daily basis. The situation is even clearer when using software for teaching: many more people are looking at it with fresh eyes.

Let’s say that we are fitting a simple linear model and we use the summary function, then POW! One gets (i) all sorts of stars next to each of the coefficients and (ii) some tiny p-values with lots of digits. Since time immemorial (childcare, at least) we have received star stickers for doing a good job, and here we have R doing the same. It is possible to remove the stars, I know, but the default is the subject of this post.
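For readers who want to see the default and the alternative side by side, here is a minimal sketch. The data are simulated purely for illustration (they are not the school data) and the names x, y and fit are my own; the only R features relied on are the show.signif.stars option and the signif.stars and digits arguments of the print method for lm summaries.

```r
# Simulated data, purely for illustration (not the school data)
set.seed(42)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

# Default output: significance stars and p-values with many digits
summary(fit)

# Same summary without stars and with fewer digits, for this call only
print(summary(fit), signif.stars = FALSE, digits = 3)

# Or switch the stars off for the rest of the session
old_opts <- options(show.signif.stars = FALSE)
summary(fit)
options(old_opts)  # restore the previous setting
```

The per-call version leaves everyone else's defaults alone; the options() version is the one to put in a startup file if the stars should go for good.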