1Pergamino, Argentina.
EEA
INTA Pergamino.
2Cordoba, Argentina
Estadística y Biometría (FCA-UNC).
3Buenos Aires, Argentina
Departamento de Producción Animal (FAUBA)
Prediction of maize (Zea mays L.) combining ability using molecular markers and
mixed linear models theory
-- Ornella1, L; Eyherabide1, G; di Rienzo2, J; Cantet3, J; Balzarini2, M
Predicting the performance of untested single
crosses is important in hybrid breeding programs because the number of possible
combinations and the cost of testing make it impossible to evaluate
all new inbreds in most situations. The traditional fixed linear model, coupled
with ordinary least squares estimation, as used by most plant breeders, is too
restrictive because of its independence assumption: the error structure is often
more complex than the one assumed in standard linear models (Balzarini, 2002). In
contrast, the general linear mixed model (Henderson, 1984) can easily
accommodate covariances among observations. Including the numerator relationship matrix
yields unbiased heritability estimates when likelihood-based methods (ML,
REML and Bayes) are used, mainly because it accounts for the correlation between
observations due to covariance between relatives and for the variation due to
genetic drift, which is important
in finite populations under selection (Sorensen & Kennedy, 1983).
The objective of this study was to analyze the
effectiveness of best linear unbiased prediction (BLUP) based on molecular
(microsatellite) marker data.
Field
data were obtained from Nestares et al. (1999): topcrosses between a collection of
48 inbred lines and four tester populations (sB73 and sMo17 from the Reid x Lancaster pattern, and HP3 and P5L2
from the local orange flint pattern) were evaluated for grain yield during the
1991/92 season in four environments. All lines but two (B73 and Mo17)
were orange flint germplasm developed by INTA from twenty different sources
(synthetics, composites, landraces, planned crosses and a commercial hybrid).
Molecular data were obtained from the
characterization of twenty-six of the 48 parent lines and the four tester
populations using 21 microsatellite markers evenly distributed across the genome
(Morales Yokobori et al., 2005).
Relatedness
(r) between parents was estimated using the MER (Moment Estimate of Relatedness)
software (Wang, 2002), with $r = 2\Theta$,
where $\Theta$, the
coefficient of coancestry, is the probability that, for any
autosomal locus, a random gene taken from individual x is identical by descent with a random gene
taken from individual y.
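As a small illustration of the relation between relatedness and coancestry stated above, the sketch below converts a matrix of coancestry coefficients into relatedness values. The function name and the numbers are hypothetical, and this is not the MER estimator itself, which works from marker genotypes.

```python
import numpy as np

def relatedness_from_coancestry(theta):
    """Return r = 2 * theta for a square, symmetric coancestry matrix."""
    theta = np.asarray(theta, dtype=float)
    if theta.ndim != 2 or theta.shape[0] != theta.shape[1]:
        raise ValueError("coancestry matrix must be square")
    if not np.allclose(theta, theta.T):
        raise ValueError("coancestry matrix must be symmetric")
    return 2.0 * theta

# Hypothetical coancestries among three lines (illustrative values only).
# Diagonal entries of 1.0 assume fully inbred lines.
theta = np.array([[1.00, 0.25, 0.00],
                  [0.25, 1.00, 0.00],
                  [0.00, 0.00, 1.00]])
r = relatedness_from_coancestry(theta)
```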
Three different
variance-covariance structures were compared using molecular and/or pedigree
data and under the following linear mixed model (Henderson, 1984):
$$y = Xb + Z_l a_l + Z_t a_t + Z_d d + Z_{ge}(ge) + e$$
where y is the response vector
(yield data of hybrids derived from the crosses between lines and testers); X,
Zl, Zt, Zd and Zge are known design
matrices; b is the vector of fixed parameters; al, at and d are vectors of random
effects associated with additive effects of lines, additive effects of testers
and dominance effects, respectively; (ge) is a vector of random effects
associated with genotype-environment interaction; and e is the vector of residuals. For the sake of
simplicity we assumed that Cge, the covariance matrix
for (ge), is an identity matrix (no
correlation between interactions).
Residuals were also considered independent.
Assumptions
regarding relatedness between parents allow particular structures to be defined for the
covariance matrices Al, At and D.
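To make the role of these covariance matrices concrete, the sketch below assembles the genetic and phenotypic covariance among line x tester hybrids from coancestry matrices of the lines and of the testers. It assumes a form commonly used for single crosses, in which two hybrids covary through their line parents, through their tester parents, and through the product of both coancestries for the dominance term. The function name, the hybrid ordering and the omission of the genotype-environment term are simplifying assumptions, not details taken from the study.

```python
import numpy as np

def topcross_covariance(F_l, F_t, var_l, var_t, var_d, var_e):
    """Genetic (G) and phenotypic (V) covariance among line x tester hybrids.

    Hybrids are ordered line-major: (line 0, tester 0), (line 0, tester 1), ...
    F_l and F_t are coancestry (or relatedness) matrices of lines and testers.
    """
    F_l = np.asarray(F_l, dtype=float)
    F_t = np.asarray(F_t, dtype=float)
    n_l, n_t = F_l.shape[0], F_t.shape[0]
    # Incidence matrices linking each hybrid to its parent line and tester
    Z_l = np.kron(np.eye(n_l), np.ones((n_t, 1)))
    Z_t = np.kron(np.ones((n_l, 1)), np.eye(n_t))
    # Assumed dominance structure: product of line and tester coancestries
    D = np.kron(F_l, F_t)
    G = (var_l * Z_l @ F_l @ Z_l.T
         + var_t * Z_t @ F_t @ Z_t.T
         + var_d * D)
    # Independent residuals on the diagonal
    V = G + var_e * np.eye(n_l * n_t)
    return G, V

# Unrelated parents (identity matrices) with illustrative variance components
G, V = topcross_covariance(np.eye(2), np.eye(2), 1.0, 2.0, 3.0, 4.0)
```

With identity matrices for both groups, hybrids covary only through a shared line or a shared tester; replacing the identities with pedigree-based or marker-based matrices changes only the inputs.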
All models were evaluated postdictively by
restricted log-likelihood (resLL) and Akaike's information criterion (AIC)
(Table 1).
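The AIC values in Table 1 are consistent with the smaller-is-better form AIC = -2 resLL + 2k, where k is the number of estimated variance components (five for Models 1 and 2, four for Model 3). The sketch below recomputes them under that assumption; the function name is hypothetical.

```python
def aic_from_reml(minus2_res_ll, n_cov_params):
    """AIC (smaller is better) from -2 restricted log-likelihood."""
    return minus2_res_ll + 2 * n_cov_params

# (-2 resLL, number of variance components) as listed in Table 1
table1 = {
    "Model 1": (6348.8, 5),
    "Model 2 pedigree": (6348.7, 5),
    "Model 2 microsatellite": (6348.6, 5),
    "Model 3 pedigree": (6349.7, 4),
    "Model 3 microsatellite": (6349.3, 4),
}

aic_values = {name: aic_from_reml(m2ll, k) for name, (m2ll, k) in table1.items()}
for name, value in aic_values.items():
    print(f"{name}: AIC = {value:.1f}")
```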
Cross-validation statistics were calculated to assess and compare the
predictive ability of some of the proposed models. For each genetic model, the
performance of a set of m missing crosses was predicted with the formula
(Balzarini, 2002):
$$y_M = C V^{-1} y_P$$
where yM is the m x 1 vector of predicted yields
of the missing crosses, yP is the p x 1 vector of average yields of the predictor
hybrids, C is the m x p matrix of genetic covariances between missing and predictor
hybrids, and V is the p x p phenotypic
variance-covariance matrix among the predictor hybrids. Because of the structure of
the crosses, the leave-one-out procedure used in this study deleted the
data of the four hybrids derived from one line (the crosses between that
line and the four testers) and predicted them from the remaining 25 x 4
hybrids (m = 4, p = 100). Effectiveness of prediction was measured by the Spearman
rank correlation (Table 2).
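The cross-validation procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the yields have already been corrected for fixed effects, that hybrids are ordered line-major, and that there are no ties when ranking. All function names and the toy data are hypothetical.

```python
import numpy as np

def predict_missing(C, V, y_p):
    """Predict missing crosses as C V^{-1} y_P without forming the inverse."""
    return C @ np.linalg.solve(V, y_p)

def leave_one_line_out(G, V, y, n_lines, n_testers):
    """Predict each line's hybrids from the hybrids of all other lines."""
    y = np.asarray(y, dtype=float)
    y_hat = np.empty_like(y)
    idx = np.arange(n_lines * n_testers)
    for line in range(n_lines):
        missing = idx[line * n_testers:(line + 1) * n_testers]
        predictors = np.setdiff1d(idx, missing)
        C = G[np.ix_(missing, predictors)]          # m x p genetic covariances
        V_pp = V[np.ix_(predictors, predictors)]    # p x p phenotypic covariances
        y_hat[missing] = predict_missing(C, V_pp, y[predictors])
    return y_hat

def spearman(a, b):
    """Spearman rank correlation (assumes no tied values)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]

# Toy example: 3 lines x 2 testers, lines 0 and 1 related, line 2 unrelated
n_lines, n_testers = 3, 2
F_l = np.array([[1.0, 0.5, 0.0],
                [0.5, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
G = np.kron(F_l, np.ones((n_testers, n_testers)))  # line effects only, for brevity
V = G + np.eye(n_lines * n_testers)
y = np.array([1.0, 0.8, 0.6, 0.5, -1.2, -0.9])
y_hat = leave_one_line_out(G, V, y, n_lines, n_testers)
```

In this toy setup the hybrids of the unrelated line have zero covariance with every predictor, so their predicted deviations are exactly zero: the prediction carries no information beyond the mean.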
Conclusions
Inclusion of the numerator relationship matrix
(using pedigree or molecular data) generates more precise variance estimates
and higher heritability values than traditional fixed-effects
models.
Results suggest that molecular
data, when used in this type of cross (genetically divergent parental
populations), did not provide any information beyond that provided by
pedigree data.
Bibliography
Balzarini, M (2002) Applications of mixed
models in plant breeding. In: Quantitative Genetics, Genomics and Plant Breeding (ed. M.S.
Kang). CAB International.
Bernardo, R (1996) Best linear unbiased prediction of maize single-cross performance
given erroneous inbred relationships. Crop Sci. 36:862-866.
Henderson, CR (1984) Applications of Linear
Models in Animal Breeding. University of Guelph.
Morales Yokobori, M; Decker, V and Ornella, LA
(2005) Analysis of heterotic maize (Zea mays L.) populations using
molecular markers. MNL 79:36.
Nestares, G; Frutos, E and Eyherabide, G
(1999) Evaluación de líneas de maíz colorado por aptitud combinatoria. Pesq. agropec. bras. 34:1399-1406.
Sorensen, DA and Kennedy, BW (1983) The use of
the relationship matrix to account for genetic drift variance in the analysis
of genetic experiments. Theor. Appl. Genet. 66:217-220.
Wang, J (2002) An estimator for pairwise
relatedness using molecular markers. Genetics 160:1203-1215.
Table 1: Variance analysis of the proposed models

| Model | Additive variance | Dominance variance | GE variance | Error | -2resLL | AIC |
|---|---|---|---|---|---|---|
| Model 1 | $\sigma^2_l$ = 3.23; $\sigma^2_t$ = 7.00 | $\sigma^2_d$ = 15.33 | $\sigma^2_{ge}$ = 3.87 | $\sigma^2_e$ = 151.36 | 6348.8 | 6358.8 |
| Model 2, pedigree | $\sigma^2_l$ = 3.38; $\sigma^2_t$ = 14.00 | $\sigma^2_d$ = 15.47 | $\sigma^2_{ge}$ = 3.88 | $\sigma^2_e$ = 151.36 | 6348.7 | 6358.7 |
| Model 2, microsatellite | $\sigma^2_l$ = 3.23; $\sigma^2_t$ = 11.11 | $\sigma^2_d$ = 15.60 | $\sigma^2_{ge}$ = 3.78 | $\sigma^2_e$ = 151.36 | 6348.6 | 6358.6 |
| Model 3, pedigree | $\sigma^2_a$ = 13.88 | $\sigma^2_d$ = 14.43 | $\sigma^2_{ge}$ = 3.83 | $\sigma^2_e$ = 151.36 | 6349.7 | 6357.7 |
| Model 3, microsatellite | $\sigma^2_a$ = 12.34 | $\sigma^2_d$ = 14.59 | $\sigma^2_{ge}$ = 3.82 | $\sigma^2_e$ = 151.37 | 6349.3 | 6357.3 |

\* Variance components
were estimated via restricted maximum likelihood (REML) using SAS (SAS
Institute) PROC MIXED.
\*\* $\sigma^2_l$: additive variance due to parent lines; $\sigma^2_t$: additive
variance due to parent testers; $\sigma^2_a$: additive
variance of testers and lines (both groups belong to the same population).
Table 2: Spearman rank correlation between observed (BLUP)
and predicted hybrid yields (Model 2)

| Population | Pedigree data | Microsatellite data |
|---|---|---|
| 26 lines | 0.40\*\* | 0.36\*\* |
| Lines derived from synthetics | 0.45\* | 0.44 |
| Lines derived from composites | 0.52\*\* | 0.49\*\* |
| Lines unrelated or highly divergent | 0.08 | 0.02 |

\* Indicates significance at P = 0.05.
\*\* Indicates significance at P = 0.01.