learning – Palimpsest

I found this text I wrote 20 years ago(*), part of a discussion document I prepared for a review of the radiata pine breeding strategy. Fixed a couple of typos, but I guess we still need pretty much the same thing. 🤔

“Different breeders value different things or, better put, they emphasise different values when developing breeding strategies. One of the reasons why many breeding programs struggle to achieve results is that they face an extremely complex list of activities, which are almost impossible to complete.

“A knee-jerk reaction from some breeders has been to resort to the KISS principle when developing breeding strategies. Unfortunately, the typical reaction has been “let’s create this dumbed-down strategy because it is simple to apply”. Bzzz. Wrong answer! What they have often done is to create a glorified “deployment strategy” that has almost no chance of surviving in the long term: that is, short-term gain at the price of long-term disappointment.

“Breeders need to realise that what needs to be simple is the _interface_ of the strategy. This means that we need a smooth interaction between the “theoretical animal” and the people that will be implementing it. This does not mean that the strategy is theoretically simple, but that the day-to-day activities are a breeze to complete.

“This type of interface requires the development—either in-house or through contracting the service—of tools that make life easy. For example:

- Easy access to predicted breeding values, including desktop and online access. In addition, there needs to be an idea of the reliability of those predicted values if we are going to use them for deployment purposes.
- Tools that make it easy to decide what to select and which trees should be mated with each other (mate selection and allocation).
- Protocols for deployment and tools for keeping control of the availability of genetic material.
- Easy management of the interaction between improvement and deployment objectives.

“In summary, breeders need tools for dealing with the huge amount of data created by breeding and deployment activities, so it can be transformed into information.”

(*) Well, 19 years ago (this was 2005), but twenty sounds much better.

In the mid-1990s I was at Massey University in Palmerston North, centre of the known universe, where I was doing my PhD. During a short course I met Arthur Gilmour, the creator of ASReml (plain vanilla version; there was no R package back then). I was really impressed by two things: 1. how insanely fast the software was, particularly compared to the SAS scripts I was used to, and 2. how strange the syntax was for anything but the simplest cases.

I was stuck while coding some multivariate analysis, hitting my head against the wall, when I complained to Arthur about the syntax. He told me that my problem was not with the syntax but with the matrices: the syntax represented direct sums and Kronecker products. After that I read the code again, thinking of matrices(*), and suddenly the syntax made sense: there was complexity because the underlying matrix operations were quite exposed in the notation. Exposing these operations was one of the keys that made ASReml so powerful.
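For readers who have not met these two operations, here is a minimal sketch in plain Python. It is not ASReml syntax, and every name and number in it is made up for illustration: `G0` is a hypothetical between-trait genetic covariance matrix, `A` a hypothetical relationship matrix for three trees, and the residual blocks assume two trees at one site and one tree at another.

```python
# Illustrative only: plain-Python versions of the two matrix operations
# mentioned above. Matrices are lists of rows; all values are hypothetical.

def kronecker(a, b):
    """Kronecker product: each element of a scales a full copy of b."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]

def direct_sum(*blocks):
    """Direct sum: blocks placed along the diagonal, zeros elsewhere."""
    total_cols = sum(len(block[0]) for block in blocks)
    result, offset = [], 0
    for block in blocks:
        width = len(block[0])
        for row in block:
            padding_right = total_cols - offset - width
            result.append([0.0] * offset + list(row) + [0.0] * padding_right)
        offset += width
    return result

# Hypothetical genetic (co)variances between two traits.
G0 = [[1.0, 0.5],
      [0.5, 2.0]]

# Hypothetical relationships: trees 1 and 2 are half-sibs, tree 3 is unrelated.
A = [[1.0, 0.25, 0.0],
     [0.25, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

# Covariance of all breeding values, ordered trait by trait: a 6 x 6 matrix.
G = kronecker(G0, A)

# Residual structure with a different variance at each of two sites.
R = direct_sum([[4.0, 0.0], [0.0, 4.0]], [[9.0]])

for row in R:
    print(row)
```

The point of the sketch is only that a short expression like "trait covariance ⊗ relationship matrix" expands into a large structured matrix, which is the kind of thing the notation was spelling out.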

Morals of the story:

- It helps to have a clue of what the software is supposed to be doing.
- Genetic analyses are turtles/matrices all the way down.
- Ask if you don’t understand. There is no point in suffering in silence.

(*) Good thing that I had gone through Searle’s “Matrix Algebra Useful for Statistics” guided/pushed by Dorian Garrick. It was hard work, but excellent background for dealing with linear mixed models.

Category: learning

Breeding: simple interfaces, complex strategies

Exposing rather than hiding complexity