1Pergamino, Argentina.

EEA INTA Pergamino.

2Córdoba, Argentina

Estadística y Biometría (FCA-UNC).

3Buenos Aires, Argentina

Departamento de Producción Animal (FAUBA)

Prediction of maize (Zea mays L.) combining ability using molecular markers and mixed linear models theory

 

    -- Ornella1[ ], L; Eyherabide1, G; di Rienzo2, J; Cantet3, J; Balzarini2, M

 

mbalzari@agro.uncor.edu

 

 

Predicting the performance of untested single crosses is important in hybrid breeding programs: the number of possible combinations and the cost of testing them make it impossible, in most situations, to evaluate all new inbreds. The traditional fixed linear model with ordinary least squares estimation, used by most plant breeders, is too restrictive because it assumes independent observations; the error structure is often more complex than the one assumed in standard linear models (Balzarini, 2002). In contrast, the general linear mixed model (Henderson, 1984) can easily accommodate covariances among observations. Including the numerator relationship matrix yields unbiased heritability estimates when likelihood-based methodologies are used (ML, REML and Bayes), mainly because it accounts for the correlation between observations due to covariance between relatives and for the variation due to genetic drift, which is important in finite populations under selection (Sorensen & Kennedy, 1983).

The objective of this study was to analyze the effectiveness of best linear unbiased prediction (BLUP) based on molecular (microsatellite) marker data.

Field data were obtained from Nestares et al. (1999): topcrosses between a collection of 48 inbred lines and four tester populations (sB73 and sMo17 from the Reid x Lancaster pattern, and HP3 and P5L2 from the local orange flint pattern) were evaluated for grain yield in four environments during the 1991/92 season. All lines but two (B73 and Mo17) were orange flint germplasm developed by INTA from twenty different sources (synthetics, composites, landraces, planned crosses and a commercial hybrid).

Molecular data were obtained from the characterization of twenty-six of the 48 parent lines and the four tester populations using 21 microsatellite markers evenly distributed across the genome (Morales Yokobori et al., 2005).*

Relatedness (r) between parents was estimated using the MER (Moment Estimate of Relatedness) software (Wang, 2002), with r = 2Q, where Q, the coefficient of coancestry, is the probability that, for any autosomal locus, a random gene taken from individual x is identical by descent with a random gene taken from individual y.

Three different variance-covariance structures, based on molecular and/or pedigree data, were compared under the following linear mixed model (Henderson, 1984):

    y = X b + Z_l a_l + Z_t a_t + Z_d d + Z_{ge} (ge) + e

 

 

Where y is the response vector (yield data of hybrids derived from the crosses between lines and testers); X, Zl, Zt, Zd and Zge are known design matrices; b is the vector of fixed parameters; al, at and d are vectors of random effects associated with the additive effects of lines, the additive effects of testers and the dominance effects, respectively; and e is the vector of residuals. (ge) is a vector of random effects associated with genotype-environment interaction; for the sake of simplicity we assumed that Cge, the covariance matrix of (ge), is an identity matrix (no correlation between interactions). Residuals were also considered independent.

Assumptions regarding relatedness between parents allow particular structures to be defined for the covariance matrices Al, At and D:

•       Model 1) Variance components; parents unrelated. Al, At and D are identity matrices.
•       Model 2) Lines and testers are derived from two different ancestral populations, so Al = {rij(l)}, At = {rij(t)} and D = {dij = 0.25 rij(l) rij(t)}, where, given hybrids i and j, rij(l) is the relatedness between their parent lines and rij(t) is the relatedness between their testers.
•       Model 3) Lines and testers are derived from the same ancestral population, so al and at can be combined into one vector a of additive effects of parents, with A = {axy}, where axy is the relatedness between parents x and y (lines and/or testers), and D = {dij = 0.25 (rij(ll) rij(tt) + rij(tl) rij(lt))}, where rij(xx) is the relatedness between parents of hybrids i and j, with l standing for lines and t for testers.
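
The Model 2 dominance structure can be illustrated with a short NumPy sketch. This is not the authors' code (the models were fitted in SAS PROC MIXED); the function name, the representation of hybrids as (line, tester) index pairs and the toy relatedness values are all assumptions made for illustration. The diagonal values of 2 correspond to r = 2Q with Q = 1, which holds for a fully inbred parent and is used for the testers here only to keep the toy example simple.

```python
import numpy as np

def model2_dominance(R_lines, R_testers, hybrids):
    """Dominance covariance among hybrids under Model 2 (illustrative sketch).

    R_lines, R_testers: relatedness matrices among lines and among testers.
    hybrids: list of (line_index, tester_index) pairs, one per hybrid.
    Returns D with d_ij = 0.25 * r_ij(l) * r_ij(t).
    """
    line_idx = np.array([line for line, _ in hybrids])
    tester_idx = np.array([tester for _, tester in hybrids])
    # Relatedness between the parent lines (and testers) of every pair of hybrids
    r_l = R_lines[np.ix_(line_idx, line_idx)]
    r_t = R_testers[np.ix_(tester_idx, tester_idx)]
    return 0.25 * r_l * r_t

# Hypothetical toy data: two lines and two testers, fully crossed
R_lines = np.array([[2.0, 0.5], [0.5, 2.0]])
R_testers = np.array([[2.0, 0.2], [0.2, 2.0]])
hybrids = [(0, 0), (0, 1), (1, 0), (1, 1)]

D = model2_dominance(R_lines, R_testers, hybrids)
print(D)
```

Under Model 1 the same matrix would simply be `np.eye(len(hybrids))`, since parents are assumed to be unrelated.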

 

All models were evaluated postdictively (by their fit to the observed data) using the restricted log-likelihood (resLL) and Akaike's information criterion (AIC) (Table 1).
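
As a small worked example of how the two fit statistics relate, the sketch below computes AIC on the smaller-is-better scale as -2resLL plus twice the number of covariance parameters. The -2resLL values are taken from Table 1; the parameter counts (five for Model 1, four for Model 3) are an inference from the variance components listed in that table, not something stated in the text, and the function name is hypothetical.

```python
def aic_from_reml(minus2_res_ll, n_cov_params):
    """AIC (smaller is better) = -2 resLL + 2 * number of covariance parameters."""
    return minus2_res_ll + 2 * n_cov_params

# (-2resLL from Table 1, assumed number of variance components)
fits = {
    "Model 1": (6348.8, 5),             # s2l, s2t, s2d, s2ge, s2e
    "Model 3 (pedigree)": (6349.7, 4),  # s2a, s2d, s2ge, s2e
}

aic = {name: aic_from_reml(m2ll, k) for name, (m2ll, k) in fits.items()}
for name, value in aic.items():
    print(f"{name}: AIC = {value:.1f}")
```

With these counts the computed values agree with the AIC column of Table 1, which suggests that the slightly lower AIC of Model 3 reflects its having one fewer variance component rather than a better likelihood.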

Cross-validation statistics were calculated to assess and compare the predictive ability of some of the proposed models. For each genetic model, the performance of a set of m missing crosses was predicted with the formula (Balzarini, 2002):

    y_M = C V^{-1} y_P

 

 

Where yM is the m x 1 vector of predicted yields of the missing crosses, yP is the p x 1 vector of average yields of the predictor hybrids, C is the m x p matrix of genetic covariances between missing and predictor hybrids, and V is the p x p phenotypic variance-covariance matrix among the predictor hybrids. Because of the structure of the crosses, the leave-one-out procedure used in this study deleted the data of the four hybrids derived from one line (the crosses between that line and the four testers) and predicted them from the remaining 25 x 4 hybrids (m = 4, p = 100). The effectiveness of prediction was measured by the Spearman rank correlation (Table 2).
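
The cross-validation scheme described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the procedure as actually run: the function names, the toy relatedness matrices, the residual variance of 4.0 and the yields are all hypothetical, the genetic covariance contains only additive terms, and the rank correlation helper assumes there are no tied values.

```python
import numpy as np

def predict_missing(C, V, y_p):
    """y_M = C V^{-1} y_P, using a linear solve instead of an explicit inverse."""
    return C @ np.linalg.solve(V, y_p)

def leave_one_line_out(G, V_full, y, line_of_hybrid):
    """Predict the hybrids of each line from the hybrids of all other lines."""
    y = np.asarray(y, dtype=float)
    line_of_hybrid = np.asarray(line_of_hybrid)
    y_hat = np.empty_like(y)
    for line in np.unique(line_of_hybrid):
        miss = line_of_hybrid == line      # hybrids of the deleted line
        pred = ~miss                       # predictor hybrids
        C = G[np.ix_(miss, pred)]          # genetic covariances, missing x predictor
        V = V_full[np.ix_(pred, pred)]     # phenotypic covariances among predictors
        y_hat[miss] = predict_missing(C, V, y[pred])
    return y_hat

def spearman_rho(a, b):
    """Spearman rank correlation (assumes no ties in either vector)."""
    rank = lambda x: np.argsort(np.argsort(x))
    return np.corrcoef(rank(a), rank(b))[0, 1]

# Hypothetical toy data: three lines crossed with two testers (six hybrids)
lines = np.array([0, 0, 1, 1, 2, 2])
testers = np.array([0, 1, 0, 1, 0, 1])
R_l = np.array([[2.0, 0.5, 0.0], [0.5, 2.0, 0.3], [0.0, 0.3, 2.0]])
R_t = np.array([[2.0, 0.2], [0.2, 2.0]])
G = R_l[np.ix_(lines, lines)] + R_t[np.ix_(testers, testers)]
V_full = G + 4.0 * np.eye(len(lines))      # assumed residual variance on the diagonal
y = np.array([60.0, 62.0, 55.0, 58.0, 70.0, 66.0])

y_hat = leave_one_line_out(G, V_full, y, lines)
rho = spearman_rho(y, y_hat)
print(y_hat, rho)
```

Solving the linear system is preferred to inverting V because it is numerically more stable and gives the same prediction.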

 

Conclusions

Inclusion of the numerator relationship matrix (built from pedigree or molecular data) generates more precise variance estimates and higher values of heritability than traditional fixed-effects models.

Results suggest that molecular data, used in this type of cross (genetically divergent parental populations), did not provide any information beyond that provided by pedigree data.

 

Bibliography

Balzarini, M (2002) Applications of mixed models in plant breeding. In: Quantitative Genetics, Genomics and Plant Breeding (ed. MS Kang). CAB International.

Bernardo, R (1996) Best linear unbiased prediction of maize single-cross performance given erroneous inbred relationships. Crop Sci. 36:862-866.

Henderson, CR (1984) Applications of Linear Models in Animal Breeding. University of Guelph.

Morales Yokobori, M; Decker, V and Ornella, LA (2005) Analysis of heterotic maize (Zea mays L.) populations using molecular markers. MNL 79:36.

Nestares, G; Frutos, E and Eyherabide, G (1999) Evaluación de líneas de maíz colorado por aptitud combinatoria. Pesq. agropec. bras. 34:1399-1406.

Sorensen, DA and Kennedy, BW (1983) The use of the relationship matrix to account for genetic drift variance in the analysis of genetic experiments. Theor. Appl. Genet. 66:217-220.

Wang, J (2002) An Estimator for Pairwise Relatedness Using Molecular Markers. Genetics 160:1203-1215.


Table 1: Variance analysis of the proposed models

Model                     Additive variance     Dominance variance  GE variance  Error       -2resLL  AIC
Model 1                   s2l=3.23, s2t=7.00    s2d=15.33           s2ge=3.87    s2e=151.36  6348.8   6358.8
Model 2 (pedigree)        s2l=3.38, s2t=14.00   s2d=15.47           s2ge=3.88    s2e=151.36  6348.7   6358.7
Model 2 (microsatellite)  s2l=3.23, s2t=11.11   s2d=15.60           s2ge=3.78    s2e=151.36  6348.6   6358.6
Model 3 (pedigree)        s2a=13.88             s2d=14.43           s2ge=3.83    s2e=151.36  6349.7   6357.7
Model 3 (microsatellite)  s2a=12.34             s2d=14.59           s2ge=3.82    s2e=151.37  6349.3   6357.3

* Variance components were estimated by restricted maximum likelihood (REML) using PROC MIXED in SAS (SAS Institute).

** s2l: additive variance due to parent lines; s2t: additive variance due to parent testers; s2a: additive variance of testers and lines (both groups belong to the same population).

 

           

 

 

 

Table 2: Spearman rank correlation between observed (BLUP) and predicted hybrid yields (Model 2)

Population                           Pedigree data  Microsatellite data
26 lines                             0.40**         0.36**
Lines derived from synthetics        0.45*          0.44
Lines derived from composites        0.52**         0.49**
Lines unrelated or highly divergent  0.08           0.02

 

* Indicates significance at P = 0.05.

** Indicates significance at P = 0.01.


[ ] Present address: Area Comunicaciones (FCIA-UNR), Rosario, Argentina

 

* We had some problems in the molecular characterization of the testers HP3 and P5L2; however, we kept the data, considering the robustness of BLUP predictors (Bernardo, 1996).