Metodoloski zvezki, Vol. 10, No. 1, 2013, 1-16 Proposition of a Hybrid Stochastic Lee-Carter Mortality Model Agnieszka Rossa1 and Leslaw Socha2 Abstract In the paper, a stochastic hybrid mortality model (EHLC) treated as a solution of stochastic differential equations is introduced. The model is defined analogously to the well-known Lee-Carter mortality model (LC). A parameter estimation procedure including a switching rule are proposed. A comparison of the predictive accuracy of the LC and EHLC models based on the mortality data for Poland has shown that the new model yields better results. 1 Introduction The analysis of mortality is one of the basic problems faced by mathematical demography. First mortality models appeared in the literature in the 19th century (see Gompertz 1825, Thiele 1872). Since that time, much work has been done to develop the methodology. Complex mortality models have recently been the subject of several research papers authored, for instance, by Booth (2006), Cairns et al. (2008), Plat (2009), Haberman and Renshaw (2011). Mortality models can be divided into two main groups. The larger group consists of static and stationary models (see, e.g. Cairns et al. 2008, Cairns et al. 2009, Cairns et al. 2011, Renshaw and Haberman 2003a, 2003b, 2006, Haberman and Renshaw 2008, 2009, 2011, Hatzopoulos and Haberman 2011, Pitacco 2004, Luciano et al. 2008, Plat 2009), where the log-odds of death probabilities or mortality rates are often expressed in a parametric form as the functions of age and calendar time. The second group encompasses dynamic models, where death probabilities or mortality rates are expressed as solutions of stochastic differential equations without jumps (see for instance Russo et al. 2011, Bayraktar et al. 2009, Luciano et al. 2008, Dahl 2004, Schrager 2006, Janssen and Skiadas 1995, Giacometti et al. 2011, Biffis 2005) or with jumps (Bravo and Braumann 2007, Bravo 2009, Coelho et al. 2010, Biffis 2005, Hainaut and Devolder 2008). When the performance of a mortality model is tested with empirical data, then its parameter estimates are found not to be constant in time - they depend on the period under study. For instance, parameter estimates based on the 1930-1950 time series may 1 Institute of Statistics and Demography, University of Lodz, Poland; agrossa@uni.lodz.pl 2 Department of Mathematics and Natural Sciences, Cardinal Stefan Wyszynski University in Warsaw, Poland; leslawsocha@poczta.onet.pl be significantly different from those obtained for the period 1960-1980 (Giacometti et al. 2011). It therefore seems reasonable to account for the changes the estimated parameter in the dynamic stochastic context by considering a switching rule. Similar problems have also occurred in other areas of research, including control theory, economics, biology or chemistry. The concept of a switching rule has been considered with respect to the stochastic dynamic hybrid (switched) systems (e.g. Liberzon 2003, Boukas 2005), which are defined as systems composed of several parametric structures described in deterministic or stochastic terms. These structures change depending on the switching rule. There is a whole range of studies on the estimation methodology applicable to stochastic dynamic hybrid (switched) systems (see, for instance Yin et al. 2002, 2003). Drawing on this methodology, we propose a new concept of stochastic hybrid mortality model. The paper is organized as follows. In Section 2, the standard Lee-Carter model and basic notations and definitions are introduced. In Section 3, the Lee-Carter model is considered as a solution of stochastic differential equations. Thereafter, in Section 4, its modified representation termed Extended Hybrid Lee-Carter model (EHLC) is proposed. The parameter estimation procedure and the definition of a switching rule are discussed in Section 5. The parameter estimates calculated for the LC and EHLC models with Polish mortality data and a comparison between the models are provided in Section 6. The last Section 7 contains concluding remarks. 2 The Lee-Carter model Before a new mortality model is discussed in the next sections, let us consider a mortality model proposed by Lee and Carter (Lee, Carter 1992). The model allows age-specific mortality rates to be forecasted based on long-term trends. For the purpose of introducing the Lee-Carter model, let us consider crude age-specific period mortality rates mx,t defined as follows (Cairns at el. 2009) Dxt mx,t = w£, x = 0,1 t = 1,2,..., T, (2.1) Nx , t where x = 0,1,..., w - index of one-year age groups, t = 1,2,... ,T - index of a calendar year under observation, Dx t - number of deaths in a year t (aged x last birthday), NVx t - death risk exposure (an average population size in a year t, aged x last birthday). The Lee-Carter model (LC) can be defined as ln mx,t = «x + bxkt + £x,t, (2.2) or equivalently mx,t = exp {ax + bxkt + £x,t}, (2.3) where kt - time parameters indexed by t = 1,2,..., T, ax, bx - age-specific parameters indexed by x = 0,1,..., w, £Xjt - random errors assumed to be iid, £xt ~ N(0, a|). To ensure the unique representation of (2.2) or (2.3), some additional constraints are imposed (see Lee, Carter 1992) in the following way T u i T y^kt = 0, bx = 1, so that ax = ln mx,t- (2.4) t=1 x=0 t=1 Parameters ax describe the general profile of mortality rates. Indices kt reflect the dominant temporal pattern in the decline of mortality (Tuljapurkar et al. 2000), while the coefficients bx express the tendency of mortality rates to change when kt changes. For example, small values of bx for some x indicate that mortality at x varies a little with changes in the general level of mortality kt (it is often the case at older ages). Parameters bx usually have the same sign, but in some cases their movement in the opposite directions is also possible. Consequently, the Lee-Carter model assumes that mortality rates move in tandem, but not in the same direction or by the same amount. The performance of the Lee-Carter model has been covered in many studies (see, e.g. Lee, Miller 2001). According to the concept underlying this methodology, parameters ax, bx, kt are estimated with the empirical data on the considered population. Lee and Carter have used in their paper the Singular Value Decomposition (SVD) method (Lee, Carter 1992). Two other methods of estimation are also proposed, i.e. Weighted Least Squares (WLS) and the Maximum Likelihood (ML) approach (see Wilmoth 1993, Brouhns et al. 2002a,b). The estimated values of kt form a time series, with one value for each year. Because of that, the statistical methods of time series modelling can be employed. Lee and Carter considered a stochastic model of random walk with a (negative) drift. Its discrete representation is the following kt = kt_i + c + et, (2.5) where c is a constant drift, and et, t = 1, ...,T are independent, normally distributed random errors. The projected kt, based on (2.5), and the estimates of ax, bx are used to forecast mortality rates and any other life-table characteristics, e.g. the expected remaining lifetime. The estimation of the drift c is based on the following formula c=(kT - ki)/(T - 1). (2.6) The variance estimator of the error term is defined by 1T = ^^E(kt - kt-i - ¿)2. (2.7) T - 1 t=2 It can be treated as the measure of uncertainty in forecasting of kt. 3 Lee-Carter model versus stochastic differential equations We will consider a stochastic process ix(t) representing a hazard rate for a person aged x at time t. The rate will be defined by means of the following stochastic differential equation d^x(t) = (ax(t) + 1 0 represent the age-specific volatility parameters. Using the Ito formula we can express (4.1)-(4.2) in the form (see e.g. Oksendal 1995) lnix(t) = ax + bxk(t) + 2 (qX - t— y*)2 > (5.3) t=1 where y* is a mean of the sequence {y*,t} over t. In order to estimate the remaining parameters, bx and q* we use the nonlinear Least Squares method, i.e. we minimize the sum lnm*,t — ^lnm*,t_1 + b*[k(t) — k(t —1)] + 2 (q* — using k(t), o* obtained in the previous steps. (5.4) 6 The results of model estimation 6.1 Estimation of LC and EHLC models for Poland To illustrate the model estimation results, we estimated the LC and EHLC models using the 1958-2000 mortality data on Poland based on the period age-specific death rates available from www.mortality.org (Human Mortality Data Base). Figure 1 presents the estimates of ax7s in the LC model defined in (2.4). Estimates of bx 's in the LC and EHLC models are very close to each other, thus they have been plotted together in Figure 2. Figure 3 presents the estimates of k derived for the LC model and the forecasts to 2012 based on (2.5). The mortality regimes Is, s = 1, 2, 3 in years, 1958-1966,1967-1990, and 1991-2000 (males), and 1958-1966, 1967-1988, 19892000 (females) have been also determined, and the parameter estimates of the function (4.6) derived. The estimated functions k(t) are plotted in Figures 4 and 5. Figures 6 and 7 show the estimates of a^ and qX. Figure 1: Estimates of ax (males and females). 0,10 x -0,04 - ---Females -Meies Figure 2: Estimates of bx (males and females). Figure 3: Estimates of kt in the LC model and their forecasts to 2012 (males and females). Figure 4: Estimates of kt in the LC model and k(t) in the EHLC model (males). Figure 5: Estimates of kt in the LC model and k(t) in the EHLC model (females). X --- Females -Males Figure 6: Estimates of aX (males and females). 0,5 -| 0,4 0,3 X Females -Males Figure 7: Estimates of q2 (males and females). 6.2 Goodness-of-fit measures for the LC and EHLC models To compare the quality of the LC and EHLC models, two well-known measures of good-ness-of-fit were applied, i.e. the Mean Squared Error and the Mean Absolute Deviation, and the ex-post errors were assessed for the period 2001-2009. The obtained results are summarized in tables 1 and 2. The Mean Squared Error and the Mean Absolute Deviation were estimated separately to each year t in both models. The measures were denoted MSE$LC\ MSE{tEHLC) and MAD(lc\ MAD(ehlc), respectively. Hence we have MSEt (LC) \ 1 104 [ln mx,t - (ax + bxkt)\ x=0 1 104 MADtLC) = — |ln mxt - (ax + bxkt) |, x=0 mseehlc) = \ 105 104 £ x=0 lnmx,t — lnmXit-1+bx[k(t) - k(t-1)] + - (qX-&X) 1 104 / i \ MAD(EHLC) = — XL lnmx,t — ^lnmx,- + bx[k(t) - k(t-1)] + - (qX - ^X) j 2 1 Table 1: Ex post comparison of the LC and EHLC by means of MSE Years t Males Females LC EHLC LC EHLC 2001 0,226 0,145 0,130 0,148 2002 0,220 0,139 0,164 0,158 2003 0,249 0,152 0,149 0,142 2004 0,254 0,152 0,156 0,133 2005 0,262 0,155 0,163 0,173 2006 0,255 0,172 0,166 0,168 2007 0,275 0,189 0,203 0,192 2008 0,308 0,224 0,190 0,171 2009 0,316 0,228 0,205 0,214 Table 2: Ex post comparison of the LC and EHLC by means of MAD Years t Males Females LC EHLC LC EHLC 2001 0,179 0,088 0,100 0,095 2002 0,182 0,097 0,126 0,106 2003 0,203 0,105 0,124 0,102 2004 0,214 0,115 0,130 0,098 2005 0,224 0,123 0,133 0,122 2006 0,225 0,141 0,139 0,122 2007 0,230 0,151 0,164 0,143 2008 0,256 0,180 0,160 0,135 2009 0,269 0,187 0,170 0,158 6.3 Comparison of residuals In this subsection we compare the quality of the LC and EHLC models using the residual analysis. Residuals in the LC model are defined as eX:t = ln mx,t - (ax + bxkt) , (6.1) whereas residuals in the EHLC model have the form imx,t — lnmx,t-i + bx[k(t) - k(t-1)] + 1 (qX - *X) ) • (6.2) 6x,i = lnmx,t- ^lnmx,t-i + bx[k(t) - k(t-1)] + 1 (ql - al In both models an assumption is embedded that standardized residuals are approximately independent standard normal variables. The simple way of testing this assumption is studying the so-called contour plots. They are plotted in Figures 8-11 for both models. The assumption implies that there should be a random pattern of negative (in white) and positive (in black) random errors. If the plot reveals some clustering of positive or negative residuals then the assumption becomes inappropriate. Thus, the contour plots are visual tests of the underlying assumption. We can see strong clustering of positive and negative residuals for the LC model (Figures 8, 10), whereas the contour plots look reasonably random in the case of the EHLC model (Figures 9, 11). However, closer inspection of Figures 9 and 11 reveals horizontal bands, which may suggest some genuine period effects. Figure 11: Contour plot of residuals in the EHLC model (females). 7 Concluding remarks In the paper a mortality model representing the family of stochastic differential equations is proposed. It is based on the Lee-Carter model (LC) treated as a solution of stochastic differential equations. The new stochastic model has been termed an Extended Hybrid Lee-Carter model (EHLC). The switching rule for mortality decline has been derived for the EHLC model from the statistical adaptive test developed by Janic-Wroblewska and Ledwina. Both EHLC and LC were applied to model age-specific mortality rates in Poland. The results have shown the EHLC model to perform better than a standard LC model regarding the ex-post predictive accuracy. Further research will seek to define a more complex switching rule and another representation of a mortality model as a solution of stochastic partial differential equations. Acknowledgement Both authors gratefully acknowledge that their research was supported by a grant from the National Science Centre under contract DEC-2011/01/B/HS4/02882. References [1] Bayraktar, E., Milevsky, M. A., Promislow, S. D. and Young, V. R. (2009): Valuation of mortality risk via the instantaneous Sharpe ratio: Applications to life annuities, Journal of Economics Dynamics and Control, 33, 676-691. [2] Biffis, E. (2005): Affine processes for dynamic mortality and actuarial valuations, Insurance: Mathematics and Economics, 37, 443-468. [3] Booth, H. (2006): Demographic forecasting: 1980 to 2005 in review, International Journal of Forecasting, 22, 547-581. [4] Boukas, E. K. (2005): Stochastic Hybrid Systems: Analysis and Design, Boston, Birkhauser. [5] Bravo, J. M. (2009): Modelling mortality using multiple stochastic latent factors, Proceedings of the 7th International Workshop on Pension, Insurance and Saving, Paris, 28-29 May, 1-15. [6] Bravo, J. M. and Braumann, C. A. (2007): The value of a random life: modelling survival probabilities in a stochastic environment, Bulletin of the International Statistical Institute, LXII, 5743-5746. [7] Brouhns, N., Denuit, M., Vermunt, J. K. (2002a): A Poisson log-bilinear approach to the construction of projected lifetables, Insurance: Mathematics and Economics, 31(3), 373-393. [8] Brouhns, N., Denuit, M., Vermunt, J. K. (2002b): Measuring the longevity risk in mortality projections, Bulletin of the Swiss Association of Actuaries, 2, 105-130. [9] Cairns, A. J. G., Blake, D. Dowd, K., Coughlan, G. D. and Epstein, D. (2011): Mortality density forecasts: An analysis of six stochastic mortality models, Insurance: Mathematics and Economics, 48, 355-367. [10] Cairns, A. J. G., Blake, D., Dowd, K., Coughlan, G. D., Epstein, D., Ong, A. and Balevich, I. (2009): A quantitative comparison of stochastic mortality models using data from England and Wales and the United States, North American Actuarial Journal, 13, 1-35. [11] Cairns, A. J. G., Blake, D. and Dowd, K. (2008): Modeling and management of mortality risk: a review, Scandinavian Actuarial Journal, 2-3, 79-113. [12] Coelho, E., Magalhaes, M. G. and Bravo, J. M. (2010): Mortality projections in Portugal, Proceedings of the Conference of European Statisticians, Working Session on Demographic Projections 28-30 April 2010, Lisbon, Portugal, EUROSTAT Series Forecasting demographic components: mortality, 1-11. [13] Dahl, M. (2004): Stochastic mortality in life insurance: market reserves and mortality-linked insurance contracts, Insurance: Mathematics and Economics, 35, 113-136. [14] Gompertz, B. (1825): On the nature of the function expressive of the law of human mortality and on a new mode of determining life contingencies, Philosophical Transactions of the Royal Society of London, Ser. A, CXV, 513-580. [15] Giacometti, R., Ortobelli, S. and Bertocchi, M. (2011): A stochastic model for mortality rate on Italian Data, Journal of Optimization Theory and Applications, 149, 216-228. [16] Haberman, S. and Renshaw, A. (2011): A comparative study of parametric mortality projection models, Insurance: Mathematics and Economics, 48, 35-55. [17] Haberman, S. and Renshaw, A. (2009): On age-period-cohort parametric mortality rate projections, Insurance: Mathematics and Economics, 45, 255-270. [18] Haberman, S. and Renshaw, A. (2008): Mortality, longevity and experiments with the Lee - Carter model, Lifetime Data Analysis, 14, 286-315. [19] Hainaut, D. and Devolder, P. (2008): Mortality modelling with Levy processes, Insurance: Mathematics and Economics, 42, 409-418. [20] Hatzopoulos, P. and Haberman, S. (2011): A dynamic parametrization modeling for the age-period-cohort mortality, Insurance: Mathematics and Economics, 49, 155-174. [21] Human Mortality Data Base, Accessed on www.mortality.org [22] Janic-Wroblewska A., Ledwina T. (2000): Data driven rank test for two-sample problem, Scandinavian Journal of Statistics, 27, 281-297. [23] Janssen, J. and Skiadas, C. H. (1995): Dynamic modelling of life table data, Applied Stochastic Models and Data Analysis, 11, 35-49. [24] Lee, R.D. and Carter, L. (1992): Modeling and forecasting the time series of U.S. mortality, Journal of the American Statistical Association, 87, 659-671. [25] Lee, R.D. and T. Miller, T. (2001): Evaluating the performance of the Lee-Carter method for forecasting mortality, Demography, 38, 537-549. [26] Liberzon, D. (2003): Switching in Systems and Control, Boston, Basel, Berlin, Birkhauser. [27] Luciano, E. Spreeuw, J. and Vigna, E. (2008): Modelling stochastic mortality for dependents lives, Insurance: Mathematics and Economics, 43, 234-244. [28] Oksendal, B. (1995): Stochastic Differential Equations, Berlin, Springer. [29] Pitacco, E. (2004): Survival models in a dynamic context: a survey, Insurance: Mathematics and Economics, 35, 279-298. [30] Plat, R. (2009): On stochastic mortality modeling, Insurance: Mathematics and Economics, 45, 393-404. [31] Renshaw, A. and Haberman, S. (2006): A cohort-based extension to the Lee - Carter model for mortality reduction factor, Insurance: Mathematics and Economics, 38, 556-570. [32] Renshaw, A. and Haberman, S. (2003a): Lee-Carter mortality forecasting with age-specific enhancement, Insurance: Mathematics and Economics, 33, 255-272. [33] Renshaw, A. and S. Haberman, S. (2003b): On the forecasting of mortality reduction factors, Insurance: Mathematics and Economics, 32, 379-401. [34] Russo, V., Giacometti, R., Ortobelli, S., Rachev, S. and Fabozzi, F. J. (2011): Calibrating affine stochastic mortality models using term assurance premiums, Insurance: Mathematics and Economics, 49, 53-60. [35] Rossa, A. (ed.) (2011): Analysis and Modeling of the Mortality in the Dynamic Context, University of Lodz Press, Lodz (in Polish). [36] Socha, L. (2008): Linearization Methods for Stochastic Dynamic Systems, Berlin, Springer. [37] Schrager, D. (2006): Affine stochastic mortality, Insurance: Mathematics and Economics, 38, 81-97. [38] Thiele, T. N. (1872): On a mathematical formula to express the rate of mortality throughout life, Journal of the Institute of Actuaries, 16, 313-329. [39] Tuljapurkar, S., Li, N. and Boe, C. (2000): A Universal Pattern of Mortality Decline in the G7 Countries, Nature 405, 789-792. [40] Wong, E. (1971): Stochastic Processes in Information and Dynamical Systems, New York, McGraw-Hill. [41] Willems, J. L., Aeyels, D. (1976): An equivalence result for moment stability criteria for parametric stochastic systems and Ito equations, International Journal of Systems Science, 7, 577-590. [42] Wilmoth, J. (1993): Computational methods for fitting and extrapolating the LeeCarter model of mortality change, Technical Report, Department of Demography, University of California, Berkeley. [43] Yin, G., Zhang, Q., Yang, H. and Yin, K. (2002): A class of hybrid market models: simulation, identification and estimation, Proceedings of the 2002 American Control Conference, Anchorage, May 8-10, 2571-2576. [44] Yin, G., Zhang, Q., Yang, H. and Yin, K. (2003): Constrained stochastic estimation algorithms for a class of hybrid stock market models, Journal of Optimization Theory and Applications, 118, 157-182. Appendix: Adaptive Test of Janic-Ledwina (JL) Let us assume that we observe a sequence of continuous random variables Ui,U2,...,UN. Each variable Ut is distributed according to a distribution function Ft. We will consider an adaptive statistical test proposed by Janic-Wroblewska and Ledwina (2000) to verify the null hypothesis H0 Ho : Fi = F = ... = Fn , against the alternative hypothesis H1 Hi : 3n£(o,i), Fi = F[Nn] = F[Nn+i] = • • • = Fn, where [Nn] denotes the integer part of the number Nn. The test statistics MN is defined as follows Mn(e,pN) = max T (S(m,pN),m), [eN]T (l, m)-1 • pN; l=1,...,dN} , dN > 0 - integer value representing the complexity of the problem (e.g. dN = 10), k 2 t=i Rt - rank of Ux,t (for each fixed x) cmt - weights defined as T(k, m) = ^^ L2(m, bn) n=i L(m, bn) = ^ Cmt • bn f Rt n0' 5 t=i N m(N-m) i N m' t = 1,2,...,m, cmt /^N-™) • , t = m + 1,...,N. \/ N N—m' ' ' bn, n = 1,..., k - the Legendre orthonormal polynomials on the interval [0,1]. Large values of MN support rejecting the null hypothesis in favor of the alternative. The authors have derived critical values of the test by using the Monte-Carlo methods. They have also proved that for k = 1 the JL test reduces to the well-known Wilcoxon rank test.