Evolving notes, images and sounds by Luis Apiolaza

Category: breeding

Time for correlations

A few posts ago I was talking about heritabilities (like here) and it’s time to say something about genetic correlations. This is how I explain correlations to myself or in meetings with colleagues. Nothing formal, mostly an analogy.

Say we have to draw a distribution of breeding values for one trait (X) and, rather than looking from the side, we look at it from the top. It looks like a straight line, where the length gives an idea of variability and the cross marks the mean. We can have another distribution (Y), perhaps not as long (so not so variable) or maybe longer.

Often variables will vary together (co-vary, vary at the same time) and we can show that by drawing the lines at an angle, where they cross at their means. If you look at the formula for the covariance (co-variance, because traits co-vary, get it?), we grab the deviation from the mean for the two traits for each of the observations, multiply them, add them all up and get their average. We get positive values for the product when both traits are above or below the mean; we get negative values when one trait is below the mean and the other above it. Covariances are a pain, as they can take any value. Instead we can use “standardised” covariances, that vary between -1 and 1: we call these things *correlations*.
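The recipe in the previous paragraph is short enough to write out. This is a minimal sketch in plain Python; the two lists of breeding values are made up for illustration, and the covariance divides by n (the "get their average" version) rather than n - 1.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Average product of the deviations from each trait's mean (divides by n)."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)

def correlation(x, y):
    """Covariance standardised by both standard deviations, so it lies in [-1, 1]."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical breeding values for two traits measured on five individuals
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

# Same trait Y expressed in different units (e.g. multiplied by 100)
y_rescaled = [100.0 * value for value in y]

print(covariance(x, y), covariance(x, y_rescaled))    # covariance changes with units
print(correlation(x, y), correlation(x, y_rescaled))  # correlation does not
```

Changing the units of Y changes the covariance by the same factor but leaves the correlation untouched, which is the practical reason for preferring the standardised version.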

If the angle between the distributions is less than 90 degrees, increasing the values of one of the traits is (on average) accompanied by an increase in the other trait. Then we have a positive covariance and, therefore, a positive correlation. The smaller the angle, the closer to a correlation of 1.

If the angle is 90 degrees (or close to it), changing the value of one trait has no (or very little) effect on the other trait. Zero correlation.

If the angle is greater than 90 degrees, changing the value of one trait tends to reduce the values of the other trait. The closer the angle to 180 degrees (so the positive values of one distribution are closer to the negative values of the other distribution), the closer to a -1 correlation.
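The angle picture can be made literal: if each trait is treated as a vector of deviations from its mean, the correlation is the cosine of the angle between the two vectors. The sketch below is only an illustration with made-up values; the function name `angle_between` is hypothetical.

```python
from math import acos, cos, degrees, radians, sqrt

def angle_between(x, y):
    """Angle (in degrees) between the mean-centred versions of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [xi - mx for xi in x]
    dy = [yi - my for yi in y]
    dot = sum(a * b for a, b in zip(dx, dy))
    norms = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    # Clamp to [-1, 1] to protect acos from floating-point rounding
    cos_theta = max(-1.0, min(1.0, dot / norms))
    return degrees(acos(cos_theta))

# Hypothetical breeding values
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_positive = [2.0, 1.0, 4.0, 3.0, 5.0]    # tends to increase with x
y_unrelated = [1.0, -1.0, 0.0, -1.0, 1.0]  # no linear association with x
y_negative = [-value for value in x]       # mirror image of x

for label, y in [("positive", y_positive),
                 ("unrelated", y_unrelated),
                 ("negative", y_negative)]:
    theta = angle_between(x, y)
    print(label, round(theta, 1), round(cos(radians(theta)), 2))
```

An angle below 90 degrees gives a positive cosine, 90 degrees gives zero, and an angle approaching 180 degrees gives a value approaching -1.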

Why do we care about these correlations? We use them all over the place in breeding. Sometimes as a measure of trade-off, as in “if I increase X, what will happen to Y?” or correlated response to selection. We also use them to understand how much information in one trait is contained in another trait, as in “can I use X as a selection criterion for Y?”. And a bunch of other uses, as well. But that’s another post.

Diagram showing correlations as angles.

When heritability is high but the phenotype is dominated by the environment

I was reading a LinkedIn post that said “heritability is the extent to which differences in observed phenotypes can be attributed to genetic differences”.

There is this idea floating around that if a trait is highly heritable, then genetics explains most of the differences we observe. I have seen it many times, both when people discuss breeding and even in political discussions. I vividly remember a think tank commentator stating that, given IQ was highly heritable, it is likely that millionaires make more money because their parents were more intelligent, or something along those lines.

I created the figure below using a dataset with wood basic density measurements (how much solid “stuff” you have in a set volume of wood) for trees growing in 17 different environments. The heritability of wood density is around 0.6; however, the differences between some environments are larger than the differences within environments.

We have to remember that heritabilities apply to specific populations and specific environments. Moreover, if we think of the mixed model analysis, we are fitting both fixed and random effects, so we are “correcting/controlling/putting individuals on the same footing” with our fixed effects, before having a look at the variation that is left over. We are then saying that, out of that leftover variation, genetics explains a proportion (and the leftover is much smaller than the variation we had before accounting for other sources of variability).

In the case of wood density of radiata pine, the environment (particularly temperature, which depends on latitude and elevation, and soil nutrients like boron) has a larger effect than genetics when looking across multiple trials. The trials with higher density are farther North in New Zealand, which is warmer. Once we are inside one of the trials, genetics explains 60% of the variability. In the same way, once we account for all other social differences, we are left with a much smaller level of variability to try to explain income differences with genetics.
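A small simulation can show how both statements hold at once. Everything in this sketch is assumed for illustration: the site means, the variances (chosen so that the within-site heritability is 0.6) and the number of trees are invented, not the real dataset. Because the data are simulated we know the true genetic values and can compute the proportions directly; a real analysis would estimate variance components with a mixed model.

```python
import random
from statistics import mean, pvariance

rng = random.Random(42)  # fixed seed so the sketch is reproducible

N_SITES, TREES_PER_SITE = 17, 500
VAR_GENETIC, VAR_RESIDUAL = 240.0, 160.0  # assumed: 240 / (240 + 160) = 0.6
# Assumed site means for wood density (kg/m3), evenly spaced; not real data
site_means = [360.0 + 10.0 * site for site in range(N_SITES)]

records = []  # one (site, genetic value, phenotype) tuple per tree
for site, mu in enumerate(site_means):
    for _ in range(TREES_PER_SITE):
        g = rng.gauss(0.0, VAR_GENETIC ** 0.5)
        e = rng.gauss(0.0, VAR_RESIDUAL ** 0.5)
        records.append((site, g, mu + g + e))

genetic_var = pvariance([g for _, g, _ in records])

# Phenotypic variation left over once each site is put on the same footing
within_var = mean([
    pvariance([p for s, _, p in records if s == site])
    for site in range(N_SITES)
])

# Phenotypic variation when all sites are pooled without any correction
total_var = pvariance([p for _, _, p in records])

share_within = genetic_var / within_var
share_overall = genetic_var / total_var
print(f"genetic share within sites: {share_within:.2f}")
print(f"genetic share across sites: {share_overall:.2f}")
```

With these assumed numbers the genetic share within sites should come out close to 0.6, while the share of the pooled variation is far smaller, because the differences between site means dominate.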

Wood variability for trees in 17 progeny trials in New Zealand.

Breeding trade-offs

On one side, it is obvious what we should do: increase any of the values in the numerator (selection intensity, accuracy and genetic variability) or reduce the denominator (how long it takes us to deliver gain). Any of those changes will increase genetic gain per year.
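The equation itself is not reproduced in this text. Judging by the numerator and denominator described above, it is presumably the usual breeder's equation expressed per unit of time; the symbols below are the conventional ones and are an assumption here, not taken from the post.

```latex
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L}
```

Here $i$ is the selection intensity, $r$ is the accuracy of selection, $\sigma_A$ is the additive genetic standard deviation (genetic variability) and $L$ is the time it takes to deliver the gain.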

However, the world is full of trade-offs. First, that equation is for a single trait and our breeding programmes deal with multiple traits, so we are selecting on an index that combines the genetic information for all traits (their genetic variability, heritabilities, and correlations) with their relative economic value. Not all the traits have the same value for industry. And not all the traits cost the same to assess: measuring an external characteristic, say size, is a lot easier than measuring internal characteristics, say chemical composition.

Perhaps it is convenient to sacrifice accuracy, using a second- or third-best method for phenotyping, if we can assess more cheaply and quickly (increasing selection intensity). Perhaps it is convenient to clone our testing material (reducing effective population size), so we genotype once but test in multiple environments for multiple traits. Or we can redefine the traits, so we are not trying to predict a specific value but just check if we meet technical/quality thresholds.
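The first of those trade-offs is easy to play with numerically. The sketch below assumes truncation selection on a normally distributed criterion in a large population, and all the numbers (candidates, accuracies, years) are hypothetical: the same 50 selections come either from 500 candidates phenotyped accurately or from 2,000 candidates phenotyped with a cheaper, less accurate method.

```python
from statistics import NormalDist

def selection_intensity(proportion_selected):
    """Standardised selection differential for truncation selection
    on a normal distribution (large-population approximation)."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(threshold) / proportion_selected

def gain_per_year(intensity, accuracy, genetic_sd, years):
    """Single-trait breeder's equation: intensity * accuracy * genetic SD / time."""
    return intensity * accuracy * genetic_sd / years

GENETIC_SD = 1.0  # work in units of genetic standard deviations
YEARS = 10.0      # hypothetical cycle length, same for both scenarios

# Scenario A: accurate but expensive phenotyping, 50 selected out of 500
precise = gain_per_year(selection_intensity(50 / 500), 0.80, GENETIC_SD, YEARS)

# Scenario B: cheaper, less accurate phenotyping, 50 selected out of 2,000
cheap = gain_per_year(selection_intensity(50 / 2000), 0.65, GENETIC_SD, YEARS)

print(f"precise phenotyping: {precise:.3f} genetic SD per year")
print(f"cheap phenotyping:   {cheap:.3f} genetic SD per year")
```

With these made-up values the cheaper method comes out ahead, because the gain in selection intensity more than offsets the loss of accuracy. Drop the cheaper method's accuracy a little further and the ranking flips, which is exactly the kind of "what if" the equation makes quick to explore.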

There are many other options and that’s why the (more general version of the) breeder’s equation is central in what we do. It permits us to play with ideas, run alternatives and adapt our breeding programmes to whatever conditions we are facing. Sometimes it is super-duper high-throughput hyperspectral drone-enabled goodness. Sometimes it is low-budget el-quicko back-of-a-workshop “appropriate” technology. Same equation, same decisions.

Having a peek at sheep breeding

One of the cool things about Quantitative Genetics is that it works everywhere. As a forester, I work with trees and my analyses reflect that, accounting for the biological constraints of our species (long-lived, usually, but not always, monoecious species—both sexes in the same individual), experimental designs (often incomplete-block), relatively shallow pedigrees (we started a few generations ago), etc.

However, as a Forestry undergrad I chose to take a Quantitative Genetics course in the Department of Animal Science at the Universidad de Chile. The examples used rabbits, sheep, etc. but the equations were directly applicable to trees. As a postgrad, I was, again, in the Department of Animal Science (at Massey this time) and the courses and discussions were mostly about cows. Unsurprisingly, the equations were directly applicable to trees.

Last week, I was fitting a multivariate animal-model BLUP with trees but, with small changes, you could use the code for cows, or rabbits, or wheat, or potatoes. This means that we, quantitative geneticists, get to be interested in the developments in other industries.

That was a long preamble! The thing is that I came across this article on Radio New Zealand: What’s the model sheep of the future? where there was a link to the nProve system “a free online tool for farmers wanting to identify breeders producing rams suitable for their own operation” developed by Beef+Lamb New Zealand. I HAD to look at nProve, of course, and there was one thing that really grabbed my attention: there is a very large number of traits that can be used to select rams, including multiple terminal indices and health indices, or you can just play directly with the breeding values for specific traits. There are regions in the country too.

It looks like a great tool to help farmers and I imagine that there must be substantial work communicating the tool to farmers. Just in case, here is a sort of equivalent tool for radiata pine in New Zealand: TopTree.

There is value in better explanations

I am often fascinated by people who can explain something that I already know but in a *much better* way. For example, Howie Hua is great at looking at mathematical issues (geometric mean, in this example) and coming up with a simple, straightforward way of presenting it; a way that feels fresh.

If we can explain things (here is my attempt at “changes of rankings”), if we can socialise, if we can share a common understanding with colleagues, customers, community; then we can do a much better job in our projects. This is true in forestry, breeding and, I guess, pretty much any activity.


© 2024 Palimpsest
