asreml-r, multiple sites
This page shows how to move from a univariate to a multivariate data format to
perform a multiple site analysis, considering each site as a different trait.
In this example, the canterbury.csv
data set contains only three variables:
Site, Id (they correspond to genotypes) and yield. Site has three levels
(Avonhead, Ilam and Yaldhurst) indicating the sites and yield is a measure of
productivity.
# Loading asreml-r package
library(asreml)
# Reading csv file. You could also use asreml.read.table
d <- read.csv('~/examples/canterbury.csv', header = TRUE)
# Creating a yield variable for each site, with NA for the
# other two sites (NA = missing value in R)
d$y1 <- ifelse(d$Site == 'Avonhead', d$yield, NA)
d$y2 <- ifelse(d$Site == 'Ilam', d$yield, NA)
d$y3 <- ifelse(d$Site == 'Yaldhurst', d$yield, NA)
We can now run the equivalent to three univariate analyses,
by binding the three columns (using cbind
) that contain each of the
different traits: y1, y2 and y3. The diagonal structure (diag
) is stating
that there are no covariances between the traits for Id
and between error
terms (units
).
## Very simple multivariate model
m1 <- asreml(cbind(y1,y2,y3) ~ trait,
random = ~ diag(trait):Id,
rcov = ~ units:diag(trait),
data = d)
summary(m1)$varcomp
We can confirm that the runs are equivalent to univariate analyses by fitting the univariate models. For example:
# It should be equivalent to site specific univariate runs
# for example, for site 1
s1 <- asreml(yield ~ 1, random = ~ Id, data = d,
subset = Site == 'Yaldhurst')
summary(s1)$varcomp
# Yes, it is the same. Compare components for site 3.
We can now complicate the multivariate analyses to allow for correlation between sites, which is really what we are after.
# First I make up some starting values, assuming a genetic correlation
# of 0.7 between sites and variances from model 1 (m1)
# corgh fits a correlation structure with heterogeneous variances
start.values <- c(0.7, 0.7, 0.7, 3.5, 0.3, 2.4)
m2 <- asreml(cbind(y1,y2,y3) ~ trait,
random = ~ corgh(trait, init = start.values):Id,
rcov = ~ units:diag(trait),
data = d)
summary(m2)$varcomp