doi:10.2478/v10014-011-0025-5
COBISS: 1.01
Agris category code: L01

ANALYSIS OF LONGEVITY IN SLOVENIAN HOLSTEIN CATTLE

Klemen POTOČNIK, Vesna GANTNER 2, Jurij KRSNIK, Miran ŠTEPEC, Betka LOGAR 3, Gregor GORJANC 1

Received August 20, 2011; accepted September 22, 2011.

The longevity of the Slovenian Holstein population was analysed using survival analysis with a Weibull proportional hazard model. Data spanned the period between January 1991 and January 2010 for 116,200 cows from 3,891 herds. Longevity was described as the length of productive life, that is, the time from first calving to culling or censoring. Records beyond the sixth lactation were censored to partially avoid the effect of preferential treatment. The statistical model included the effects of age at first calving, stage of lactation within parity, yearly herd size deviation, season defined as year, herd, and sire-maternal grandsire (mgs). Some effects were time varying, which led to 1,839,307 elementary records, or on average 16 per cow. Herd and sire-maternal grandsire effects were modelled hierarchically. The pedigree for sires and maternal grandsires included 2,284 entries. The estimated variance between herds was 0.12, while the between-sire variance was 0.04. Heritability was estimated at 0.14. The genetic trend for sires was unfavourable, but not significant. Further research is needed to define the required number of daughters per sire and the dynamics of genetic evaluation for sires whose daughters mostly still have censored records.
Key words: cattle / breeds / Slovenian Holstein / longevity / Weibull proportional hazards model

1 INTRODUCTION

Longevity is a trait with great impact on the economics of dairy production and is, therefore, of considerable importance in dairy cattle breeding programmes (Charffeddine et al., 1996; Strandberg and Soelkner, 1996).
With the increase of longevity, the proportion of mature cows, which produce more milk, increases. For example, Strandberg (1996) estimated that an increase in longevity from three to four lactations increases average milk yield per lactation and profit per year by 11 to 13%. In addition, improved longevity decreases replacement costs and somewhat increases selection intensity.

There are several ways to include longevity in the breeding goal, directly or indirectly. Direct longevity can be represented as the length of (productive) life (LPL) or as stayability. In cattle breeding, LPL is usually defined as the time elapsed between the first calving and culling, while stayability is defined as a binary trait that measures cow survival (alive or culled) at a certain point in time. The use of LPL is preferred, since stayability as a discrete trait provides less information. Unfortunately, LPL, as well as stayability, can be fully quantified only after the cows are culled, though both approaches provide partial information when a cow survives to the next "period" in life. Therefore, information on the longevity of the daughters of a sire becomes available only as the sire ages, which inherently prolongs the generation interval. The low heritability of longevity (Short and Lawlor, 1992; Vollema and Groen, 1996) makes breeding values (BV) estimated only from the information on parents or grandparents unreliable.

1 Univ. of Ljubljana, Biotechnical Fac., Dept. of Animal Science, Groblje 3, SI-1230 Domžale, Slovenia
2 J.J. Strossmayer Univ. in Osijek, Fac. of Agriculture, Trg Svetog Trojstva 3, 31000 Osijek, Croatia
3 Agricultural Institute of Slovenia, Hacquetova 17, SI-1000 Ljubljana, Slovenia

Acta agriculturae Slovenica, 98/2, 93-109, Ljubljana 2011
Because of the long generation interval, breeding programmes also include indirect measures of longevity via correlated traits such as fertility, health, and conformation traits (Burnside et al., 1984). An additional advantage is that data on these indirect traits can be collected relatively early in the life of a cow. Nonetheless, both representations of longevity (direct and indirect) have merit in a modern breeding goal (Essl, 1998).

Indirect representations of longevity are to a large extent analysed with a standard linear model based on the Gaussian (normal) distribution. A specific approach is needed for a proper analysis of LPL, because some animals are still alive at the time of analysis (censored records) and because culling criteria change over the productive life of cows (time varying covariates) (e.g. Ducrocq et al., 1988a). Excluding censored records from the analysis, or treating them as uncensored, leads to biased results (Ducrocq, 1994). Additionally, the relationship between longevity and the effects acting on it is multiplicative rather than additive (e.g. Ducrocq et al., 1988a). Survival analysis can handle this kind of data. In recent years several countries have introduced direct longevity into the routine genetic evaluation of cattle, and most of them use the Weibull proportional hazard model (INTERBULL, 2009), which represents a class of models in the field of survival analysis. Other statistical approaches (models) can also be used, but proportional hazard models have better properties (e.g. Caraviello et al., 2004; Jamrozik et al., 2008; Potocnik et al., 2008).

The aim of this study was to present the results of genetic evaluation for the length of productive life in the Slovenian Holstein population using a Weibull proportional hazard model.

2 MATERIAL AND METHODS

2.1 DATA

Raw data for 126,716 Slovenian Holstein cows born from 1982 to 2008 were provided by the Agricultural Institute of Slovenia.
To make use of older data without having to model the period before 1991, the truncation date was set at January 1st 1991. The date of last data collection was January 29th 2010. For cows alive at that time, longevity was treated as right censored. Longevity was defined as the length of productive life (LPL) and was calculated as the number of days from the first calving to culling (uncensored/complete records) or to the moment of data collection (incomplete/censored records). The LPL of cows surviving beyond the sixth lactation was also censored in order to avoid the effect of preferential treatment and to focus on early culling in the life of a cow. Cows with missing or inconsistent data within the defined limits were removed (29,252 cows): culling before the date of truncation, calving date after the date of culling, no information for 600 days after calving, missing data for the first three lactations, daughters of sires with fewer than 20 daughters, and missing covariate or factor data. The structure of the data used and descriptive statistics are given in Table 1. Altogether, LPL records of 116,200 cows from 3,891 herds were used in the analysis. Cows in the analysis were daughters of 707 sires, while the whole sire-maternal grandsire pedigree consisted of 2,284 sires.

Table 1: Structure of data and descriptive statistics (± standard deviation)

Cows, no.                               116,200
Sires, no.                                  707
Pedigree, no.                             2,284
Censored records, %                        41.0
Number of lactations in life
  uncensored records                        3.0
  censored records                          3.0
Length of productive life, days
  uncensored records                1,095 ± 660
  censored records                  1,129 ± 754

Cows were on average culled in the third lactation, which amounted to 1,095 days of productive life. The percentage of censored records was 41.0%.
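The definition of LPL and the censoring rules described in this section can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the authors' data preparation procedure: the function name and the argument `end_sixth_lactation` are hypothetical, and the sketch assumes that a cow surviving beyond the sixth lactation is censored on the date that lactation ends.

```python
from datetime import date
from typing import Optional, Tuple

# Date of last data collection, as stated in the text.
DATA_COLLECTION = date(2010, 1, 29)


def length_of_productive_life(
    first_calving: date,
    culling: Optional[date] = None,
    end_sixth_lactation: Optional[date] = None,  # hypothetical input
    data_collection: date = DATA_COLLECTION,
) -> Tuple[int, bool]:
    """Return (LPL in days, censored flag) for one cow.

    Assumption: cows that outlive their sixth lactation are censored
    on the date on which that lactation ends.
    """
    # Default: the cow is still alive at data collection (right censored).
    end, censored = data_collection, True

    # A culling date on or before data collection is a complete record.
    if culling is not None and culling <= end:
        end, censored = culling, False

    # Survival beyond the sixth lactation turns the record into a censored one.
    if end_sixth_lactation is not None and end_sixth_lactation < end:
        end, censored = end_sixth_lactation, True

    if end < first_calving:
        raise ValueError("record ends before the first calving")

    return (end - first_calving).days, censored
```

A culled cow yields a complete record, while a cow with no culling date yields a record censored at the date of data collection. Truncation at January 1st 1991 and the data edits listed above are deliberately left out of the sketch.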
Censored records had about the same means, but larger variability.

2.2 SURVIVAL ANALYSIS

A Weibull proportional hazards model was used for the analysis of LPL. This model is built upon the Weibull distribution, whose density (1) and hazard (2) functions for the i-th record $t_i$ are:

$$f(t_i \mid \lambda, \rho) = \lambda\rho(\lambda t_i)^{\rho-1} \exp\left(-(\lambda t_i)^{\rho}\right), \qquad (1)$$

$$h(t_i \mid \lambda, \rho) = \lambda\rho(\lambda t_i)^{\rho-1}, \qquad (2)$$

where $\lambda$ (scale) and $\rho$ (shape) are strictly positive parameters. In a proportional hazard model it is assumed that the baseline hazard function changes proportionally with a change in covariate(s) or factor levels. For the analysis of LPL the hazard function was modelled as:

$$h(t_{ijklmnop} \mid \lambda, \rho, \text{else}) = h_0(t_{ijklmnop} \mid \lambda, \rho) \exp\left(c_i + l_j + y_k + h_l + d_m + s_n + \tfrac{1}{2}s_o\right), \qquad (3)$$

where:
$h(t_{ijklmnop} \mid \lambda, \rho, \text{else})$ = hazard of culling of the p-th cow given the other parameters,
$h_0(t_{ijklmnop} \mid \lambda, \rho)$ = baseline Weibull hazard function (2),
$c_i$ = i-th age at first calving: 0 (unknown) and from 19 to 50 months,
$l_j$ = j-th lactation stage (1-60 days, 61-150 days, 151-270 days, 271 days till drying off, and dry period) within parity, altogether 30 levels; time varying factor,
$y_k$ = k-th season defined as year (1990-2010); time varying factor,
$h_l$ = l-th herd (3,891 levels); time varying factor,
$d_m$ = m-th herd size deviation in comparison to the previous year (< -70%, (-70%, -40%], (-40%, -10%], (-10%, 10%], (10%, 40%], (40%, 70%], and > 70%); time varying factor,
$s_n + \tfrac{1}{2}s_o$ = n-th sire and o-th maternal grandsire (from here on both effects are termed sire effect) of the p-th cow.

Levels of time varying factors (lactation stage within parity, year, herd, and herd size deviation) changed as the "status" of a cow changed over time, creating successive elementary records, while levels of the other effects were constant over the whole lifetime of a cow. Altogether, there were 1,839,307 elementary records.
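How elementary records enter a proportional hazards model can be sketched numerically. The example below is an illustration only and not the software used in this study: the function names and the tuple layout `(start, end, eta)` are assumptions, where `eta` stands for the sum of all effects in the exponent of (3) during one elementary record. It relies on the standard result that the cumulative baseline hazard of the Weibull distribution with scale `lam` and shape `rho` is `(lam * t) ** rho`.

```python
import math
from typing import Iterable, Tuple


def weibull_hazard(t: float, lam: float, rho: float) -> float:
    """Baseline Weibull hazard (2): lam * rho * (lam * t) ** (rho - 1)."""
    return lam * rho * (lam * t) ** (rho - 1)


def weibull_survival(t: float, lam: float, rho: float) -> float:
    """Baseline Weibull survival function: exp(-(lam * t) ** rho)."""
    return math.exp(-((lam * t) ** rho))


def survival_from_elementary_records(
    records: Iterable[Tuple[float, float, float]], lam: float, rho: float
) -> float:
    """Survival probability of one cow at the end of its last elementary record.

    Each record is (start, end, eta): the effects are constant within a
    record, so its contribution to the cumulative hazard is
    exp(eta) * (H0(end) - H0(start)), with H0(t) = (lam * t) ** rho.
    """
    cumulative_hazard = 0.0
    for start, end, eta in records:
        baseline_increment = (lam * end) ** rho - (lam * start) ** rho
        cumulative_hazard += math.exp(eta) * baseline_increment
    return math.exp(-cumulative_hazard)
```

When `eta` is zero in every record, the result reduces to the baseline Weibull survival function. A constant `eta` of ln 2 doubles the hazard at every time point and therefore squares the baseline survival probability, which is the proportionality assumption of the model.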
Herd and sire effects were modelled hierarchically: a log-gamma distribution was assumed for the herd effect and a multivariate normal distribution for the sire effect, with the additive genetic covariance matrix built from the pedigree. The Weibull proportional hazards model used and the corresponding assumptions can be sketched in matrix form as:

$$\mathbf{y} \mid \mathbf{b}, \mathbf{h}, \mathbf{s}, \rho \sim \text{Weibull}(\mathbf{Xb} + \mathbf{Zh} + \mathbf{Ws}, \rho), \qquad (4)$$

$$\mathbf{h} \mid \gamma \sim \text{Log-Gamma}(\gamma, \gamma), \qquad (5)$$

$$\mathbf{s} \mid \mathbf{G} \sim \text{Normal}(\mathbf{0}, \mathbf{G}), \qquad (6)$$

where:
$\mathbf{b}$ = the vector with the intercept ($\rho \ln(\lambda)$) and parameters for the following effects: age at first calving, stage of lactation within parity, year, and the deviation of herd size from year to year,
$\mathbf{h}$ = the vector of parameters for the herd effect,
$\mathbf{s}$ = the vector of parameters for the sire effect,
$\gamma$ = Log-Gamma distribution parameter,
$\mathbf{G}$ = additive genetic covariance matrix among sires, a product of the numerator relationship matrix between sires $\mathbf{A}$ and the additive genetic variance between sires ($\sigma_s^2$).

Heritability according to the model (3, 4-6) was calculated following Meszaros et al. (2010): h2 = 4