Category: python

Audrey Hepburn being all multilanguage and agnostic.

Become an interfaith polyglot

2024-07-22 / Luis

I have been very busy with the start of the semester, teaching regression modelling. The craziest thing was that the R installation was broken in the three computer labs I was allocated to use. It would not have been surprising if I were talking about Python ( 🤣 ), but the installation script had a major bug. Argh!

Anyhow, I was talking with a student who was asking me why we were using R in the course (she already knew how to use Python). If you work in research for a while, particularly in statistics/data analysis, you are bound to bump into long-lived discussions. It isn’t the Text Editor Wars nor the Operating System Wars. I am referring to two questions that come up all the time in long threads:

What language should I learn or use for my analyses?
Should I be a Bayesian or a Frequentist? You are supposed to choose a statistical church.

The easy answer to the first one is “because I say so”: it’s my course. A longer answer is that a Domain Specific Language makes life a lot easier, as it is optimised for the tasks performed in that domain. An even longer answer points to something deeper: a single language is never enough. My head plays images of Minitab, SAS, Genstat, Splus, R, ASReml, etc. that I had to use at some point just to deal with statistics. Or Basic, Fortran, APL (crazy, I know), Python, Matlab, C++, etc. that I had to use as more general languages at some point. The choice of language will depend on the problem and the community/colleagues you end up working with. Over the course of your career you become a polyglot.

As an agnostic (on my good days) or an atheist (on my bad ones) I am not prone to joining churches. In my research, I tend to use mostly frequentist stats (of the REML persuasion) but, sometimes, Bayesian approaches feel like the right framework. In most of my problems both schools tend to give similar, if not identical, results.

I have chosen to be an interfaith polyglot.

Behind the clock at Musée d'Orsay, Paris, France (Photo: Luis).

Python not suitable platform for reproducible research

2024-01-28 / Luis

While [Active Papers] has achieved its mission of demonstrating that unifying computational reproducibility and provenance tracking is doable and useful, it has also demonstrated that Python is not a suitable platform to build on for reproducible research. Breaking changes at all layers of the software stack are too frequent.
Konrad Hinsen in Archiving Active Papers

I started using Python for my PhD around 1997, to control simulations I wrote using Fortran 90. I chose Python based on Konrad Hinsen’s writings at the time on a long-disappeared website. A few years later I moved all my work to R, which I found much more stable. I have some 20-year-old base R code that still runs. 😇

Incidentally, last year I wrote a series of posts on Some love for base R.

Gratuitous picture: just different hardware (Photo: Luis).

Flotsam 13: early July links

2013-07-10 / Luis

Man flu kept me at home today, so I decided to do something ‘useful’ and go for a linkathon:

Ed Yong discusses the effect of subject expectations in psychology experiments in Nice Results, But What Did You Expect? At the beginning there was another article on The placebo phenomenon, and another one on The placebo defect.
A googleVis tutorial to create Hans Rosling-type graphs from R.
Google’s Python Class is material for an intensive 2-day course on Python.
An opinion piece on Calculus and statistics by Daniel Kaplan, on teaching a different version of your typical introductory calculus course so that it is useful for statistics. He goes as far as teaching calculus using R. There is more information in Project MOSAIC.
Nice graphs on what happened to Asiana Airlines flight 214. I didn’t know there was so much available data for a specific flight.
Biased and Inefficient, Thomas Lumley’s personal statistics blog (he insists that posting 75% of Statschat is not enough to qualify as personal). You may know Thomas from the survey package (or a few others).
If you are a postgrad student in New Zealand you can apply for a NeSI (New Zealand eScience Infrastructure) postgraduate allocation to access high performance computing facilities.
My previous post, on the USA versus Western Europe comparison of GM corn, was the first time that I received more traffic from Facebook than from R-bloggers. Five hundred readers in total.

Over and out.

Gratuitous picture: looking for peace in Japan (Photo: Luis).

Pythonic links

2012-11-29 / Luis

Before I forget: a few links about starting up in Python for scientific projects:

Basic Data Analysis and More - A Guided Tour Using Python. PDF on arXiv.
Python Scientific Lecture Notes. Quick introduction chapters.
The Hitchhiker’s Guide to Python! An opinionated intro to Python.
Numba versus Cython, for when you really need speed.

Now if we had a great Python library for linear mixed models life would be easier.

Late-April flotsam

2012-04-26 / Luis

It has been a month and a half since I compiled a list of statistical/programming internet flotsam and jetsam.

Via Lambda The Ultimate: Evaluating the Design of the R Language: Objects and Functions For Data Analysis (PDF). A very detailed evaluation of the design and performance of R. HT: Christophe Lalanne. If you are into statistical genetics and on Twitter, Christophe is the man to follow.
Attributed to John Tukey, “without assumptions there can be no conclusions” is an extremely important point, which comes to mind when listening to the fascinating interview with Richard Burkhauser on the changes in income for the middle class in the USA. Changes to the definition of the unit of analysis may give a completely different result. By the way, does someone have a first-hand reference to Tukey’s quote?
Nature news publishes RNA studies under fire: High-profile results challenged over statistical analysis of sequence data. I expect to see this happening more often once researchers get used to uploading the data and code for their papers.
Bob O’Hara writes on Why simple models are better, which is not positive towards the machine learning crowd.
A Matlab Programmer’s Take On Julia, and a Python developer interacts with Julia developers. Not everything is smooth. HT: Mike Croucher.
Dear NASA: No More Rainbow Color Scales, Please. HT: Mike Dickinson. Important: this applies to R graphs too.
Rafael Maia asks “are programmers trying on purpose to come up with names for their languages that make it hard to google for info?” These are the suggestions if one searches Google for Julia:

I suggest creating a language called Bieber and searching for dimension Bieber, loop Bieber and regression Bieber.

That’s all folks.