doi:10.2478/v10014-010-0017-x    COBISS: 1.01    Agris category code: U10

SIMPLE REPARAMETERIZATION TO IMPROVE CONVERGENCE IN LINEAR MIXED MODELS

Gregor GORJANC 1, Tina FLISAR 2, Jose Carlos MARTÍNEZ-ÁVILA 3, Luis Alberto GARCÍA-CORTÉS 3

Received October 08, 2010; accepted December 01, 2010.

1 Univ. of Ljubljana, Biotechnical Fac., Dept. of Animal Science, Groblje 3, SI-1230 Domžale, Slovenia, Ph.D., e-mail: gregor.gorjanc@bf.uni-lj.si
2 Same address as 1
3 Departamento de Mejora Genética, Instituto Nacional de Investigación Agraria, Carretera de La Coruña, km 7, 28040 Madrid, Spain, Ph.D.

Simple reparameterization to improve convergence in linear mixed models

Slow convergence and poor mixing are among the main problems of Markov chain Monte Carlo (McMC) algorithms applied to mixed models in animal breeding. Poor convergence is to a large extent caused by high posterior correlation between variance components and the solutions for the levels of the associated effects. A simple reparameterization of the conventional model for variance component estimation is presented which improves McMC sampling and provides the same posterior distributions as the conventional model. The reparameterization is based on the rescaling of hierarchical (random) effects in a model, which alleviates posterior correlation. The developed model is compared against the conventional model using several simulated data sets. Results show that the presented reparameterization leads to better behaviour of the associated sampling methods and is several times more efficient for low values of heritability.

Key words: statistics / mixed model / Bayesian analysis / McMC / reparameterization / convergence

Simple reparameterization to improve the convergence of linear mixed models

Slow convergence is one of the biggest problems in the application of the Markov chain Monte Carlo (McMC) method to mixed models in the field of genetics and breeding of domestic animals. Poor convergence is largely a consequence of high posterior correlation between variance components and the solutions for the levels of the associated effects. We present a simple reparameterization of the conventional model which improves the properties of the McMC method and yields the same posterior distributions of the model parameters as the standard approach. The reparameterization is based on the standardization of the hierarchical (random) effects in the model, which in turn changes the posterior correlations between the parameters. Both approaches were compared on a large set of simulated data. The results show that the reparameterization leads to more efficient McMC sampling methods and is several times more efficient for the analysis of traits with low heritability.

1 INTRODUCTION

Mixed models are abundantly used in the field of animal breeding and genetics with the aim to infer genetic values of animals given some phenotypic and pedigree information (Henderson, 1984). In its simplest form the mixed model can be written as:

\[ y = Xb + Za + e, \quad (1) \]

where $y$ is a vector of phenotypes, $b$ is a vector of effects like sex, breed, age, etc., $a$ is a vector of individual additive genetic effects and $e$ is a vector of residuals, $p(e \mid \sigma^2_e) \sim N(0, I\sigma^2_e)$, while $X$ and $Z$ are design matrices linking effects to phenotypic records. Pedigree information is included in the model hierarchically through the prior distribution of the individual additive genetic values, $p(a \mid A, \sigma^2_a) \sim N(0, A\sigma^2_a)$. Henderson (1972) developed the so-called mixed model equations (2) to efficiently obtain joint solutions for $b$ and $a$, where $G = A\sigma^2_a$ and $R = I\sigma^2_e$:
\[
\begin{bmatrix}
X^T R^{-1} X & X^T R^{-1} Z \\
Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1}
\end{bmatrix}
\begin{bmatrix} b \\ a \end{bmatrix}
=
\begin{bmatrix}
X^T R^{-1} y \\
Z^T R^{-1} y
\end{bmatrix}
\quad (2)
\]

Use of the mixed model equations assumes known variance components $\sigma^2_a$ and $\sigma^2_e$. The standard procedure is to estimate these variance components with the restricted maximum likelihood method (REML; Patterson and Thompson, 1971) and to plug these estimates into the mixed model equations (2), ignoring the error of estimation in the variance components. Another approach to statistical inference, the Bayesian approach, treats the inference of all model parameters jointly. Although conceptually very appealing, the Bayesian approach leads to formulas that are computationally intractable. This can be avoided by sampling methods such as Markov chain Monte Carlo (McMC; e.g., Gelman et al., 2004). Wang et al. (1993) showed how McMC methods can be used with linear mixed models in animal breeding applications. In the case of linear mixed models all McMC computations follow from the posterior distribution (3):

\[
p(b, a, \sigma^2_a, \sigma^2_e \mid y) \propto
|R|^{-\frac{1}{2}} \exp\!\left(-\tfrac{1}{2}(y - Xb - Za)^T R^{-1} (y - Xb - Za)\right)
\times |G|^{-\frac{1}{2}} \exp\!\left(-\tfrac{1}{2}\, a^T G^{-1} a\right),
\quad (3)
\]

where the prior distributions for $b$ and both variance components $\sigma^2_a$ and $\sigma^2_e$ were assumed uniform (e.g., Gelman et al., 2004). Given that $\sigma^2_a$ and $a$ are a priori correlated due to the prior definition of $a$, the a posteriori correlation between them is expected to be high. This leads to high autocorrelation between consecutive samples, making the McMC method inefficient. Autocorrelations can be especially problematic with low or near-zero values of some variance components (e.g., the additive genetic variance). This is caused by the shrinkage of $a$ towards zero: in the next round of sampling the variance component will again be close to zero, which can leave the sampler stuck for quite some time at values near zero (Gelman et al., 2004). Chib and Carlin (1999) proposed block sampling of some parameters in (2) to improve convergence. Autocorrelation has also been alleviated by the use of centered models (Gelfand et al., 1995), parameter expanded models (Liu and Wu, 1999; Gelman et al., 2003; Gelman, 2004) and data augmentation based models (Meng and van Dyk, 1997; van Dyk and Meng, 2001). These methods have been applied both to accelerate the Expectation-Maximization (EM) algorithm and to alleviate the autocorrelation of McMC algorithms. In this work a reparameterization will be employed where the additive genetic values will be a priori uncorrelated with $\sigma_a$. This approach will be compared against the conventional model of Wang et al. (1993).
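For concreteness, here is a minimal sketch in Python/NumPy of how the mixed model equations (2) could be set up and solved for given variance components. The function name `solve_mme` and the dense-matrix treatment are our own illustration, not code from the paper; practical animal breeding implementations exploit sparsity and construct $A^{-1}$ directly from the pedigree using Henderson's rules.

```python
import numpy as np

def solve_mme(X, Z, A, y, var_a, var_e):
    """Solve Henderson's mixed model equations (2) for b and a,
    with G = A*var_a and R = I*var_e (hypothetical helper, dense matrices)."""
    Ainv = np.linalg.inv(A)      # for large pedigrees A^-1 is built directly, not inverted
    lam = var_e / var_a          # multiplying (2) through by var_e turns G^-1 into A^-1 * lam
    C = np.block([
        [X.T @ X,  X.T @ Z],
        [Z.T @ X,  Z.T @ Z + Ainv * lam],
    ])
    s = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(C, s)
    nb = X.shape[1]
    return sol[:nb], sol[nb:]    # solutions for b (fixed) and a (additive genetic)
```

Since $R = I\sigma^2_e$, the common factor $1/\sigma^2_e$ cancels from both sides of (2), which is why only the variance ratio $\lambda = \sigma^2_e / \sigma^2_a$ enters the left-hand side above.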
2 METHOD

Let us consider a simple animal model $y = Xb + Za + e$ with the following distributional assumptions:

\[
\begin{aligned}
p(y \mid b, a, \sigma^2_e) &\sim N(Xb + Za, I\sigma^2_e) \\
p(a \mid A, \sigma^2_a) &\sim N(0, A\sigma^2_a) \\
p(e \mid \sigma^2_e) &\sim N(0, I\sigma^2_e)
\end{aligned}
\quad (4)
\]

For this particular case, and assuming uniform priors for $b$ and both variance components, $p(b) \propto \text{const.}$, $p(\sigma^2_a) \propto \text{const.}$ and $p(\sigma^2_e) \propto \text{const.}$, the joint posterior distribution of all model parameters follows from (3) with $G = A\sigma^2_a$ and $R = I\sigma^2_e$. The model is then reparameterized by rescaling the additive genetic values,

\[ a = u\sigma_a, \qquad p(u \mid A) \sim N(0, A), \]

so that the model becomes

\[ y = Xb + Zu\sigma_a + e \]

and $u$ is a priori uncorrelated with $\sigma_a$. The location parameters $b$ and $u$ can be sampled with single-site Gibbs updates from normal full conditional distributions:

\[
p(b_i \mid b_{-i}, u, \sigma_a, \sigma^2_e, y) \sim N\!\left(\frac{\hat{s}_i}{c_{i,i}}, \frac{\sigma^2_e}{c_{i,i}}\right),
\quad (8)
\]

\[
p(u_i \mid u_{-i}, b, \sigma^2_a, \sigma^2_e, y) \sim N\!\left(\frac{\hat{s}_i}{c_{i,i}}, \frac{\sigma^2_e}{c_{i,i}}\right),
\quad (9)
\]

where both $\hat{s}_i$ and $c_{i,i}$ are closely related to the conventional mixed model equations (2), but modified as:

\[
C = \begin{bmatrix}
X^T X & X^T Z \sigma_a \\
Z^T X \sigma_a & Z^T Z \sigma^2_a + A^{-1} \sigma^2_e
\end{bmatrix},
\qquad
s = \begin{bmatrix}
X^T y \\
Z^T y\, \sigma_a
\end{bmatrix}.
\quad (10)
\]

The full conditional distribution of $\sigma^2_e$ can be sampled from a scaled inverted chi-square distribution with $n - 2$ degrees of freedom, as in the conventional model:

\[
p(\sigma^2_e \mid b, u, \sigma^2_a, y) \sim (y - Xb - Zu\sigma_a)^T (y - Xb - Zu\sigma_a)\, \chi^{-2}_{n-2}.
\quad (11)
\]

After some algebra, the full conditional distribution of $\sigma_a$ is

\[
p(\sigma_a \mid b, u, \sigma^2_e, y) \propto \exp\!\left(-\frac{u^T Z^T Z u\, \sigma^2_a - 2\, u^T Z^T (y - Xb)\, \sigma_a}{2\sigma^2_e}\right),
\quad (12)
\]

from which a truncated normal distribution can be recognized when presented in terms of $\sigma_a$, with mean $\frac{u^T Z^T (y - Xb)}{u^T Z^T Z u}$, variance $\frac{\sigma^2_e}{u^T Z^T Z u}$ and truncation point at 0:

\[
p(\sigma_a \mid b, u, \sigma^2_e, y) \sim TN\!\left(\frac{u^T Z^T (y - Xb)}{u^T Z^T Z u},\; \frac{\sigma^2_e}{u^T Z^T Z u},\; 0\right).
\quad (13)
\]

When the full conditional distribution of $\sigma^2_a$ does not involve the neighbourhood of zero, it is a scaled non-central $\chi^2$ distribution with 1 degree of freedom, with scale parameter $\frac{\sigma^2_e}{u^T Z^T Z u}$ and noncentrality parameter

\[
\lambda^2 = \frac{u^T Z^T (y - Xb)(y - Xb)^T Z u}{2\, u^T Z^T Z u\, \sigma^2_e}:
\]

\[
p(\sigma^2_a \mid b, u, \sigma^2_e, y) \sim \frac{\sigma^2_e}{u^T Z^T Z u}\, \chi^2_1(\lambda^2).
\quad (14)
\]

For cases where the posterior distribution of $\sigma_a$ is close to zero, a Metropolis-Hastings algorithm with a positive proposal can be implemented, where the natural logarithm of the conditional density derived from (12) is:

\[
\log p(\sigma_a \mid b, u, \sigma^2_e, y) = -\frac{u^T Z^T Z u\, \sigma^2_a - 2\, u^T Z^T (y - Xb)\, \sigma_a}{2\sigma^2_e} + \text{const.}
\]
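To make the sampling scheme concrete, the following is a minimal sketch in Python of the two variance draws of the reparameterized Gibbs sampler, following (11) and (13). The function names are our own, `scipy.stats.truncnorm` is one possible way to draw from the truncated normal, and `b` and `u` are assumed to have just been updated from (8) and (9) within the same Gibbs cycle.

```python
import numpy as np
from scipy import stats

def sample_var_e(y, X, Z, b, u, sigma_a, rng):
    """Draw sigma^2_e from its scaled inverted chi-square full conditional (11)."""
    e = y - X @ b - (Z @ u) * sigma_a         # residuals of the reparameterized model
    n = y.size
    return (e @ e) / rng.chisquare(n - 2)     # SSE times an inverted chi-square draw

def sample_sigma_a(y, X, Z, b, u, var_e, rng):
    """Draw sigma_a from its truncated normal full conditional (13)."""
    Zu = Z @ u
    q = Zu @ Zu                               # u'Z'Zu
    mean = Zu @ (y - X @ b) / q               # u'Z'(y - Xb) / (u'Z'Zu)
    sd = np.sqrt(var_e / q)
    lower = (0.0 - mean) / sd                 # standardized truncation point at 0
    return stats.truncnorm.rvs(lower, np.inf, loc=mean, scale=sd, random_state=rng)
```

A complete cycle alternates these draws with the location updates from (8) and (9); samples of $\sigma^2_a$ are then obtained simply as $\sigma_a^2$, and a Metropolis-Hastings step based on the log density above would replace `sample_sigma_a` only when the posterior mass of $\sigma_a$ concentrates near zero.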