IMAD Working Paper Series
http://www.umar.gov.si/en/publications/working_papers

Arjana Brezigar Masten, Marko Glažar, Janez Kušar, Igor Masten1

Forecasting Macroeconomic Variables in Slovenia Using Dynamic Factor Models

Working Paper No. 9/2008, Vol. XVII

Abstract: In this paper we consider dynamic factor models of Stock and Watson (1998) for forecasting macroeconomic variables in Slovenia. Altogether, the results of this paper support the usefulness of dynamic factor models in applied forecasting in situations where a lack of data does not permit the application of complex large-scale structural models.

Key words: forecasting, factor models, transition economies, short panels

The Working Paper Series is intended for the publication of the findings of research work still in progress, the analysis of data series, and the presentation of methodologies in particular research areas. The aim of the series is to encourage the exchange of ideas about economic and development issues and to publish findings quickly, even if they are not fully conclusive. The opinions, findings, and conclusions expressed are entirely those of the authors and do not necessarily represent the views of the Institute of Macroeconomic Analysis and Development. The contents of this publication may be reproduced in whole or in part provided that the source is acknowledged.

1 University of Ljubljana, Faculty of Economics, Ljubljana.

Publisher: Institute of Macroeconomic Analysis and Development, Gregorčičeva 27, SI-1000 Ljubljana, Slovenia
Phone: (+386) 1 478 1012
Fax: (+386) 1 478 1070
E-mail: gp.umar@gov.si
Editor in Chief: Barbara Ferk, MSc (barbara.ferk@gov.si)
Working Paper: Forecasting Macroeconomic Variables in Slovenia Using Dynamic Factor Models
Authors: Arjana Brezigar Masten (arjana.brezigar@gov.si); Marko Glažar, MSc (marko.glazar@gov.si); Janez Kušar (janez.kusar@gov.si); Igor Masten, PhD (igor.masten@ef.uni-lj.si)
Language Editor: Amidas d. o. o.
Peer reviewer: Timotej Jagrič, PhD

Ljubljana, October 2008

CIP - Cataloguing in Publication
National and University Library, Ljubljana
519.862:338(497.4)(0.034.2)
FORECASTING macroeconomic variables in Slovenia using dynamic factor models [Electronic resource] / Arjana Brezigar Masten ... [et al.]. - Text data. - Ljubljana : Institute of Macroeconomic Analysis and Development, 2008. - (Working paper series / IMAD ; 2008, 9)
Access mode (URL): http://www.umar.gov.si/fileadmin/user_upload/publikacije/dz/2008/dz09-08.pdf
ISBN 978-961-6031-76-9
1. Brezigar Masten, Arjana
241507328

CONTENTS
1 METHODOLOGY
1.1 An approximate dynamic factor model
1.2 Use of data in estimation of factors
1.3 Forecasting models
1.4 Forecast comparison
2 DATA
3 FORECASTING PERFORMANCE
3.1 GDP growth
3.2 Growth of gross fixed capital formation
3.3 Growth of private consumption
3.4 Growth of exports
3.5 Growth of imports
3.6 Growth of industrial output
3.7 The role of lagged variables in factor extraction
3.8 The role of preselection of variables
4 CONCLUDING DISCUSSION
Appendix A: In-sample forecasts for 2- and 3-quarter horizons
Appendix B: Data set

List of tables and figures
Table 1: MSE relative to AR (rMSE) at 3 horizons
Table 2: Median factor forecasts according to the number of lags included in extraction of factors (rMSE relative to AR)
Table 3: MSE of best factor models relative to AR (rMSE) with and without pre-selection of variables at one-quarter-ahead forecast horizon
Figure 1: Forecast of GDP growth one quarter ahead for the period 2004Q3 - 2007Q1
Figure 2: Forecasting growth of gross fixed capital formation one quarter ahead for the period 2004Q3 - 2007Q1
Figure 4: Forecasting growth of private consumption one quarter ahead for the period 2004Q3 - 2007Q1
Figure 5: Forecasting growth of exports one quarter ahead for the period 2004Q3 - 2007Q1
Figure 6: Forecasting growth of imports one quarter ahead for the period 2004Q3 - 2007Q1
Figure 7: Forecasting growth of industrial output one quarter ahead for the period 2004Q3 - 2007Q1
Figure A.1: GDP growth, 2 quarters ahead
Figure A.2: GDP growth, 3 quarters ahead
Figure A.3: Growth of gross fixed capital formation, 2 quarters ahead
Figure A.4: Growth of gross fixed capital formation, 3 quarters ahead
Figure A.5: Growth of private consumption, 2 quarters ahead
Figure A.6: Growth of private consumption, 2 quarters ahead
Figure A.7: Growth of imports, 2 quarters ahead
Figure A.8: Growth of imports, 2 quarters ahead
Figure A.9: Growth of exports, 2 quarters ahead
Figure A.10: Growth of exports, 3 quarters ahead
Figure A.11: Industrial production growth, 2 quarters ahead
Figure A.12: Industrial production growth, 3 quarters ahead
Table A.1: List of series used in the quarterly data

Summary

In this paper we consider dynamic factor models of Stock and Watson (1998) for forecasting macroeconomic variables in Slovenia, where available time series are relatively short and subject to structural
change. Results reveal that factor models yield significant gains in forecasting precision relative to simple time series models. In addition, we consider two methodological modifications in factor estimation. The first is the inclusion of lagged variables, the second the preselection of variables. These modifications do not lead to better forecasting performance in general, but the results can be useful for certain variables and forecast horizons.

Extended summary (translated from Slovenian)

In this working paper we analyse the use of the dynamic factor models of Stock and Watson (1998) for forecasting macroeconomic variables for Slovenia. The advantages of dynamic factor models are even more pronounced in the case of Slovenia, because the available time series are relatively short and often contain breaks. Factor models can, to some extent, compensate for the shortness of the time series with a larger number of series. In the approach of Stock and Watson (1998), the factors are estimated by the method of principal components. The first step in our case is the estimation of the factors and factor loadings. Several methods exist for determining the number of factors, among them the three information criteria proposed by Bai and Ng (2002). In our case this method always suggests a single factor. We therefore take a different approach: we compute 12 factors from the data and then use different numbers of factors in the forecasting models. We also evaluate two methodological modifications in the computation of factors. The first is the use of lagged variables, with up to three lags. The second is preselection, that is, the use of series selected in advance according to their correlation coefficient with the dependent variable.

The specification of the forecasting models follows Marcellino et al. (2003) and Banerjee et al. (2005):

$y^h_{t+h} = \mu + \alpha(L) y_t + \beta(L)' Z_t + \varepsilon^h_{t+h}$

where $y^h_{t+h}$ is the dependent variable at time t+h, $Z_t$ the vector of predictors at time t, $\alpha(L)$ a scalar lag polynomial, $\beta(L)$ a vector lag polynomial and $\mu$ a constant. The forecast horizon in our case is h = 1,…,3. The forecasting models we estimate differ in the choice of $Z_t$ and in the inclusion of lagged dependent variables. In the factor models, $Z_t$ is given by the factors estimated from the approximate dynamic factor model. For all models we compute simulated out-of-sample forecasts with a recursive methodology for the period after Slovenia's accession to the EU. We compare the forecasting power of the models on the basis of the relative mean squared errors (MSE) of the competing models. The benchmark model is an autoregression in which the number of lags is determined by the BIC criterion.

The database consists of 60 quarterly series for the period 1994Q1-2007Q1, 19 of which refer to the international environment. In the working paper we evaluate forecasts for the following series: gross domestic product, gross fixed capital formation, private consumption, imports, exports and industrial production in manufacturing.

The results confirm the greater forecasting power of factor models relative to simple time series models. The benefits of factor models increase with the forecast horizon, which is especially important in practice, since macroeconomic variables are rarely forecast only one quarter ahead. For the use of lagged variables and preselection in the computation of factors we cannot speak of a general improvement of the models, but the results can be usefully applied for individual variables and forecast horizons.

1 METHODOLOGY

1.1 An approximate dynamic factor model

In this section we outline the generalized factor model.
For a more detailed description of factor models, their estimation and use in forecasting, see Stock and Watson (1998) (hereinafter: SW). The main factor model used in the past to extract dynamic factors from economic time series has been the state space model estimated using maximum likelihood. This model was used in conjunction with the Kalman filter in a number of papers, for example SW (1991). However, maximum likelihood estimation of a state space model is not practical when the dimension of the model becomes too large, due to the computational costs. To solve this problem, SW (1998) have suggested principal component estimation. This method can accommodate a very large number of time series and can consistently estimate the factor space asymptotically (Kapetanios and Marcellino, 2003).

The premise of the dynamic factor model is that the co-variation among economic time series variables at leads and lags can be traced to a few underlying unobserved time series or factors. The disturbances to these factors might represent major aggregate shocks to the economy, such as demand or supply shocks. Accordingly, dynamic factor models express observed time series as a distributed lag of a small number of unobserved common factors, plus idiosyncratic disturbances.

Formally, in a dynamic r-factor model each element of the vector $y_t = [y_{1t}, \ldots, y_{Nt}]'$ that is a stationary random variable can be represented as

$y_{it} = \lambda_i(L) f_t + u_{it}$, (1)

where $\lambda_i(L) = [\lambda_{i1}(L), \ldots, \lambda_{ir}(L)]$ and $f_t = [f_{1t}, \ldots, f_{rt}]'$. The vector $u_t = [u_{1t}, \ldots, u_{Nt}]'$ comprises N idiosyncratic disturbances, $f_t$ is a vector of r common factors, and $\lambda_i(L)$ is a vector lag polynomial, called the "dynamic factor loading". If the lag polynomial $\lambda_i(L)$ is assumed to have a finite order q, (1) can be written as:

$y_{it} = \lambda_i' f_t + u_{it}$, (2)

in which there are s static factors consisting of the current and lagged values of the r dynamic factors, and where $\Lambda = [\lambda_1, \ldots, \lambda_N]'$.
The representation (2) is called the static representation of the dynamic factor model. Factors $f_t$, loadings $\Lambda$ and disturbances $u_{it}$ are unobserved, and the $u_{it}$ are assumed to form a vector of uncorrelated errors with

$E(u_t) = 0$ and $E(u_t u_t') = \Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$.

When it also holds for the vector of common factors that $E(f_t) = 0$, $E(f_t f_t') = \Omega$ and $E(f_t u_t') = 0$, we speak of the strict factor model. Because $f_t$ and $u_t$ are uncorrelated at all leads and lags, the covariance matrix of $y_t$, $\Sigma_{yy}$, is the sum of two parts, one arising from the common factors and the other arising from the idiosyncratic disturbances:

$E(y_t y_t') = \Sigma_{yy} = \Lambda \Sigma_{ff} \Lambda' + \Sigma_{uu}$, (3)

where $\Sigma_{ff}$ and $\Sigma_{uu}$ are the variance matrices of $f_t$ and $u_t$ (that is, $\Omega$ and $\Sigma$ above). This is the usual variance decomposition of classical factor analysis.

A dynamic factor model can be estimated by principal components. The starting point in SW's approach is the estimation of factors and loadings. Let $Y$ denote the $T \times N$ data matrix, $F = [f_1, \ldots, f_T]'$ the $T \times r$ matrix of factors, and $\lambda_i'$ the i-th row of $\Lambda$. Under the assumption that the number of factors is known, they define the estimators $\hat{\Lambda}$ of $\Lambda$ and $\hat{F}$ of $F$ by solving the nonlinear least squares problem:

$\min_{f_1, \ldots, f_T} (NT)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it} - \lambda_i' f_t)^2$ s.t. $T^{-1} F'F = I_r$. (4)

The estimated factor matrix $\hat{F}$ is simply $\sqrt{T}$ times the eigenvectors corresponding to the r largest eigenvalues of the $T \times T$ matrix $YY'$. Given $\hat{F}$, the optimal estimators of $\Lambda$ are the OLS estimators of the coefficients in a regression of $y_{it}$ on the estimated factors $\hat{F}$:

$\hat{\Lambda}' = T^{-1} \hat{F}' Y$. (5)

The estimates $\hat{F}$ could be rescaled so that:

$N^{-1} \hat{\Lambda}' \hat{\Lambda} = I_r$. (6)

Approximate factor models are more general than strict factor models. First, they allow for weak serial correlation of the idiosyncratic errors. Thus, the principal component estimator remains consistent if the idiosyncratic errors are generated by a stationary ARMA process. Second, the idiosyncratic errors may be weakly cross-correlated and heteroscedastic. Third, the model allows for weak correlation among factors and idiosyncratic components (Breitung and Eickmeier, 2005a).
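As a hedged illustration of the principal component estimator in (4) and (5), the following Python sketch computes the factors from the leading eigenvectors of YY' and the loadings by OLS. The function name `estimate_factors`, the simulated panel and all parameter values are assumptions made for illustration only; they do not reproduce the authors' own code or data.

```python
import numpy as np

def estimate_factors(Y, r):
    """Principal component estimates of r static factors (illustrative sketch).

    Y : (T, N) array of standardized series.
    Returns F_hat (T, r), normalized so that F_hat'F_hat / T = I_r,
    and Lambda_hat (N, r).
    """
    T = Y.shape[0]
    # Y Y' is a symmetric T x T matrix; eigh returns eigenvalues in ascending order
    eigval, eigvec = np.linalg.eigh(Y @ Y.T)
    top = np.argsort(eigval)[::-1][:r]       # indices of the r largest eigenvalues
    F_hat = np.sqrt(T) * eigvec[:, top]      # sqrt(T) times the leading eigenvectors
    Lambda_hat = Y.T @ F_hat / T             # OLS of each series on the factors, as in (5)
    return F_hat, Lambda_hat

# Synthetic two-factor panel (hypothetical data, not the IMAD data set)
rng = np.random.default_rng(0)
T, N, r = 50, 30, 2
F_true = rng.standard_normal((T, r))
Lam_true = rng.standard_normal((N, r))
Y = F_true @ Lam_true.T + 0.5 * rng.standard_normal((T, N))
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)     # zero sample mean, unit sample variance

F_hat, Lambda_hat = estimate_factors(Y, r)
common = F_hat @ Lambda_hat.T                # estimated common component
```

The estimated factors identify the factor space only up to a rotation, so individual columns of `F_hat` should not be compared one-to-one with the simulated factors; only the space they span is meaningful.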
Finally, we have to discuss the determination of the number of factors. To determine the number of factors empirically, a number of criteria have been suggested. For the approximate factor model, Bai and Ng (2002) formulate the problem of estimating the number of factors as one of model selection, each model allowing for a different number of latent factors. They introduce three information criteria based on the residuals of the time series regressions of predictors on a given set of r factors, corrected by a penalty term. When their method is applied to our data, the suggested number of factors is always one. However, judging by the out-of-sample forecast results and the distribution of eigenvalues, the number of latent factors is certainly greater than one. Since we work with quite noisy data, the penalty term may not be appropriately scaled to the large residuals of the series' regressions on the factors, irrespective of the factors used (Grenouilleau, 2006). Instead of using the Bai and Ng (2002) criteria, we extract up to 12 factors from the data and combine them in a flexible way in a number of competing forecasting models. In this respect we consider models with a fixed structure and models selected with the BIC criterion.

1.2 Use of data in estimation of factors

We address two issues in the estimation of factors. First, the factors can be extracted from an unbalanced panel of available time series or from a balanced panel, and we consider both. In the results below we label these two cases with the prefixes nbp and bp, respectively. The former contains more variables than the latter, and therefore more information. The drawback is that missing observations have to be estimated in the first stage, which could introduce noise in the factor estimation (see Angelini, Henry and Marcellino, 2004). In our application the missing observations are interpolated with the EM algorithm assuming a factor structure of the data.
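The interpolation of missing observations under a factor structure can be illustrated with a simplified EM-style iteration: missing cells are initialized at the sample mean, factors are extracted from the completed panel, and the missing cells are then replaced by their fitted common component until the fill-in stops changing. This is a minimal sketch under stated assumptions (a standardized panel, a fixed number of factors `r`, and the hypothetical function name `em_impute`); the exact variant of the algorithm used in the paper is not specified in the text.

```python
import numpy as np

def em_impute(X, r, n_iter=200, tol=1e-8):
    """Fill NaNs in a standardized (T, N) panel using an r-factor structure (sketch)."""
    missing = np.isnan(X)
    filled = np.where(missing, 0.0, X)       # start from the mean of standardized data (zero)
    T = X.shape[0]
    for _ in range(n_iter):
        # Principal component factors and loadings from the completed panel
        eigval, eigvec = np.linalg.eigh(filled @ filled.T)
        F = np.sqrt(T) * eigvec[:, np.argsort(eigval)[::-1][:r]]
        Lam = filled.T @ F / T
        # Replace only the missing cells by the fitted common component
        updated = np.where(missing, F @ Lam.T, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled

# Synthetic panel with 10% of the cells removed (hypothetical data).
# For simplicity the panel is standardized before cells are removed.
rng = np.random.default_rng(1)
T, N, r = 60, 20, 2
full = rng.standard_normal((T, r)) @ rng.standard_normal((r, N))
full = full + 0.1 * rng.standard_normal((T, N))
full = (full - full.mean(axis=0)) / full.std(axis=0)
mask = rng.random((T, N)) < 0.1
panel = np.where(mask, np.nan, full)

completed = em_impute(panel, r)
```

In practice the standardization would have to be computed from the observed cells only and possibly updated within the iterations; that step is omitted here to keep the sketch short.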
Second, in order to potentially improve the forecasting performance of factor models we consider two extensions. The first is preselection of variables, as suggested by Boivin and Ng (2006). Specifically, a larger cross-sectional dimension improves the precision of the factor estimates when the additional variables are driven by the same factors. If the added variables, however, are driven by different factors, in particular if the latter have a low correlation with the target variable in the forecasting exercise, this can create serious problems (Banerjee, Marcellino and Masten, 2008). To bypass this problem, Boivin and Ng (2006) suggested preselecting the variables to be used for factor estimation. Preselection of variables is based on their correlation with the target variable over the full sample. The threshold value of the correlation coefficient in absolute value is 0.20 for GDP, private consumption, fixed investment, imports and inflation, 0.25 for industrial production and 0.30 for exports. Since the number and list of variables in each subset differ across forecast variables and forecast horizons, a new set of factors is estimated for each combination of the two (the maximum number of factors is set to 3).

The second extension we consider in the computation of factors is the inclusion of lagged series. Such an approach has been advocated by Schneider and Spitzer (2004). Since we consider 3 forecast horizons, we include up to 3 "new sets of series": the original series lagged by 1 to 3 quarters. In order to assess the effect more carefully we also test the model on the original series augmented with one and with two sets of lagged series. The effects of such a modification on forecasting performance, relative to the case where factors are computed without including lagged variables in the dataset, are shown for each forecast variable in the sections to follow.

1.3 Forecasting models

The specification of forecasting models follows Marcellino et al.
(2003) and Banerjee et al. (2005). All models are specified and estimated as a linear projection of an h-step-ahead variable, $y^h_{t+h}$, onto the t-dated vector of predictors $Z_t$:

$y^h_{t+h} = \mu + \alpha(L) y_t + \beta(L)' Z_t + \varepsilon^h_{t+h}$, (7)

where $\alpha(L)$ is a scalar lag polynomial, $\beta(L)$ is a vector lag polynomial and $\mu$ represents a constant. In our empirical application we set the forecast horizon to h = 1,…,3. There are two main advantages of the h-step-ahead projection approach (Marcellino et al., 2001). First, it eliminates the need to estimate additional equations for simultaneously forecasting $Z_t$, e.g. by a vector autoregression (VAR). Second, it reduces the potential impact of specification error in the one-step-ahead model (including the equations for $Z_t$) by using the same horizon for estimation as for forecasting.

All dependent variables are modelled as I(1), so that $y_t$ is, respectively, the growth rate of industrial production, GDP, consumption, investment, exports and imports. The particulars of the construction of $y^h_{t+h}$ depend on whether the series is modelled as I(1) or I(2). In the I(1) case we have

$y^h_{t+h} = \sum_{s=t+1}^{t+h} \Delta x_s = x_{t+h} - x_t$,

so that $y^h_{t+h}$ represents the change (growth rate in the case of variables in logs) in the series between time periods t and t + h. In the I(2) case, on the other hand,

$y^h_{t+h} = \sum_{s=t+1}^{t+h} \Delta x_s - h \Delta x_t$, or $y^h_{t+h} = x_{t+h} - x_t - h \Delta x_t$, and $y_t = \Delta^2 x_t$.

The forecasting models being considered differ in the choice of $Z_t$. All the methods entail some model selection choices, in particular the number of autoregressive lags and the number of lags of the predictor variables $Z_t$ to include in (7).

Autoregressive forecast (ar_bic): Our benchmark forecast is a univariate autoregressive (AR) forecast, based on (7), excluding $Z_t$. The lag length is chosen with the BIC criterion, with a maximum of 4 lags.

Autoregressive forecast with second differencing (ar_bic_i2): Slovenia has gone through several economic and institutional changes.
Some time series may thus suffer from structural breaks. Since second differencing of the variables might improve forecasting performance (Clements and Hendry, 2000), we also estimate the model (7), excluding $Z_t$, treating $y_t$ as I(2). The lag length is chosen with the BIC criterion.

Autoregressive forecast with intercept correction (ar_bic_ic): When structural breaks appear over the forecasting period, intercept correction could be useful. Adding past forecast errors to the forecast corrects the forecast in the right direction. Hence, the forecast is given by $\hat{y}^h_{t+h} + \varepsilon^h_t$, where $\hat{y}^h_{t+h}$ is the ar_bic forecast and $\varepsilon^h_t$ is the forecast error made when forecasting $y_t$ in period t-h. On the other hand, we should be aware that adding a moving average component to the forecast error increases the mean square error if the correction is not needed (Clements and Hendry, 2000; Artis and Marcellino, 2001).

Factor model forecast: These forecasts are based on setting $Z_t$ in (7) to be the estimated factors from the approximate dynamic factor model described above. We allow estimated factors to enter $Z_t$ in different ways (see also Banerjee et al., 2005). First, in addition to the current and lagged $y_t$, up to 4 factors and 3 lags of each of these factors are included in the model (fdiarlag_bic). Second, up to 12 factors are included, but not their lags (fdiar_bic). Third, up to 12 factors appear as regressors in (7), but no current or lagged $y_t$ is included (fdi_bic). For each of these three classes of factor-based forecasts the model selection is again based on BIC. Fourth, in order to evaluate the forecasting role of each factor, we also consider forecasts using a fixed number of factors, from 1 to 12, for an unbalanced and a balanced panel (fdiar_01 to fdiar_12 and fdi_01 to fdi_12).
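The factor-based forecasts can be illustrated with a minimal Python sketch of the direct h-step regression (7) combined with BIC selection. The function names `direct_fit` and `select_by_bic` are hypothetical, and two simplifications are assumptions of this sketch rather than statements about the paper: candidate models use the first k factors in eigenvalue order, and all candidates are estimated on a common sample so that their BIC values are comparable.

```python
import numpy as np

def direct_fit(y, Z, h, p, start):
    """OLS fit of the h-step target on a constant, p lags of y and Z_t.

    Returns (forecast made at the last observation, BIC of the regression).
    """
    T = len(y)
    csum = np.concatenate(([0.0], np.cumsum(y)))    # csum[j] = y[0] + ... + y[j-1]

    def regressors(t):
        lags = [y[t - j] for j in range(p)]         # y_t, ..., y_{t-p+1}
        return np.concatenate(([1.0], lags, Z[t]))

    origins = range(start, T - h)                   # forecast origins with an observed target
    X = np.array([regressors(t) for t in origins])
    # Target: cumulated growth between t and t+h, i.e. y_{t+1} + ... + y_{t+h}
    target = np.array([csum[t + h + 1] - csum[t + 1] for t in origins])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    n, k = X.shape
    bic = n * np.log(resid @ resid / n) + k * np.log(n)
    return regressors(T - 1) @ beta, bic

def select_by_bic(y, F, h, max_p=4, max_k=None):
    """Choose the number of AR lags p and the number of factors k by BIC."""
    max_k = F.shape[1] if max_k is None else max_k
    start = max(max_p - 1, 0)                       # common sample for all candidates
    best = None
    for p in range(max_p + 1):
        for k in range(max_k + 1):
            forecast, bic = direct_fit(y, F[:, :k], h, p, start)
            if best is None or bic < best[1]:
                best = (forecast, bic, p, k)
    return best

# Synthetic example (hypothetical data): next-quarter growth depends on one factor
rng = np.random.default_rng(2)
T = 200
z = rng.standard_normal(T)
y = np.empty(T)
y[0] = 0.0
y[1:] = 0.8 * z[:-1] + 0.1 * rng.standard_normal(T - 1)
F = z[:, None]

forecast, bic, p, k = select_by_bic(y, F, h=1, max_p=2)
ar_forecast, ar_bic = direct_fit(y, F[:, :0], 1, 1, 1)   # AR(1) benchmark without factors
```

Setting `max_k=0` reduces the selection to a pure autoregression in the spirit of ar_bic, while restricting `p` to zero corresponds to regressions on factors alone, as in the fdi-type models.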
Finally, since there are many more versions of the factor forecasts than of each of the other competing models, we also constructed pooled factor forecasts by taking a simple average of all the factor-based forecasts, in order to characterize the overall performance of the factor models. These pooled forecasts are then compared to the actual values of the series in the same way as for any other forecasting model. It is worth noting that the pooled factor forecasts have particular informative value. Since we consider many different versions of factor models, it should not be surprising to find at least one model that forecasts better than simple linear models. The average performance of factor models in this respect tells us whether factor models are in general a better forecasting device or whether their relatively good performance is limited to certain special sub-models.

1.4 Forecast comparison

The forecast comparison of models was conducted in a simulated out-of-sample framework, where all statistical calculations were done using a fully recursive methodology. As the time window for the evaluation of pseudo out-of-sample forecasting performance we chose the period after the accession of Slovenia to the EU. This means that in the first step the models are estimated on data from 1994Q1 to 2004Q2 and h-step-ahead forecasts (for h from 1 to 3) are computed. In the next step, the sample is augmented by 1 quarter and the corresponding h-quarter-ahead forecast is computed. The forecast period is 2004Q3 to 2007Q1, so for the horizon of 1 quarter we have 11 pseudo out-of-sample forecasts, while for horizon 3 there are 9. The whole process of model estimation, standardization of data, calculation of estimated factors, etc. is repeated for each recursion.

Forecasting performance of the various methods described is examined by the relative mean square forecast error (MSE).
MSE compares the performance of a candidate forecast (forecast i) to a benchmark forecast, where both are computed using the pseudo out-of-sample methodology. Specifically, let $\hat{Y}^h_{i,t+h|t}$ denote the pseudo out-of-sample forecast of $Y^h_{t+h}$, computed using data through time t, based on the ith individual indicator. Let $\hat{Y}^h_{0,t+h|t}$ denote the corresponding benchmark forecast made using autoregression. Then the MSE of the candidate forecast, relative to the benchmark forecast, is

Relative MSE $= \dfrac{\sum_{t=T_1}^{T_2-h} (Y^h_{t+h} - \hat{Y}^h_{i,t+h|t})^2}{\sum_{t=T_1}^{T_2-h} (Y^h_{t+h} - \hat{Y}^h_{0,t+h|t})^2}$

where $T_1$ and $T_2 - h$ are respectively the first and last dates over which the pseudo out-of-sample forecast is computed. As explained above, we set $T_1$ to 2002Q3 and $T_2$ to 2007Q1. If the relative MSE of the candidate forecast is less than one, then the forecast based on that leading indicator outperformed the AR benchmark. West (1996) standard errors are computed around the relative MSE.

2 DATA

The dataset contains 63 quarterly series for the period 1994Q1–2007Q1, 38 of which refer to Slovenia and 25 to the international environment. The main source of data is the Statistical Office of the Republic of Slovenia; other sources are the Bank of Slovenia, Eurostat and the Ministry of Finance of the Republic of Slovenia. The list of all series is given in Appendix B. The dataset comprises real output variables (GDP, components of GDP, industrial production), international trade variables (exports, imports), survey data (consumer and industrial confidence), labour market variables (employment, unemployment rate, wages), prices, interest rates and exchange rates. In principle we could have added many more series to our panel. However, many are unreliable due to statistical inconsistencies, such as changes in definition or capture and/or limited time series.
For these reasons we decided to confine our dataset to those variables whose quality we are confident about.2 The series for the international environment contain eurozone interest rates, prices in the eurozone, GDP in the eurozone and the US, industrial production in the eurozone, and exports and imports of the eurozone and the US.

Factor analysis requires some pre-treatment of the data. We followed the three-stage approach used in Marcellino et al. (2003). First, the series are seasonally adjusted using the X-11 ARIMA procedure.3 Second, the series are transformed to account for stochastic and deterministic trends; logarithms are taken for all nonnegative series that are not already in rates or percentage units. Variables describing real economic activity are treated as I(1), whereas survey data are treated as I(0). All series are further standardized to have a zero sample mean and unit sample variance. Finally, the series are screened for large outliers (observations exceeding six times the inter-quartile range), and the outliers are treated as missing data. The EM algorithm is used to estimate the factor model for the resulting unbalanced panel.

2 We wish to thank the IMAD experts for helping us identify poor series.
3 Statistical package EViews was used for the seasonal adjustment.

3 FORECASTING PERFORMANCE

This section presents a comparison of the forecast performance of the models described in previous sections. The models and data set used for the purposes of this paper have been used in the periodic IMAD forecasting of macroeconomic variables.4 The focus is on the following series: gross domestic product (GDP), gross fixed capital formation (GFCF), private consumption (PCONS), imports (IMP), exports (EXP) and industrial production in manufacturing (IPSID). In the following text we present the best models for forecasting the observed variables at 3 different horizons. The best forecasting models are presented in Table 1.
As already noted, the measure of performance of a model is the relative pseudo out-of-sample MSE (mean square error) compared to the AR model. The MSE relative to the benchmark AR model (rMSE, relative mean square error) for all models and horizons is reported in the Statistical Appendix on the website.

Table 1: MSE relative to AR (rMSE) at 3 horizons

Horizon  Row         GDP growth                GFCF growth               PCONS growth              EXP growth                IMP growth                Ind. prod. growth
1        Best model  0.15 nbp_ar_11 (lag_1)    0.37 nbp_10 (lag_1)       0.63 nbp_lag_03 (orig)    0.87 nbp_ar_02 (lag_3)    0.39 nbp_02 (lag_2)       0.54 nbp_ar_02 (lag_1)
1        Pooled      0.62                      0.77                      1.02                      1.51                      0.63                      0.78
1        Pooled ic   0.84                      0.89                      1.09                      1.85                      0.79                      1.48
1        RMSE AR     0.005                     0.030                     0.004                     0.018                     0.033                     0.016
2        Best model  0.10 nbp_icar_08 (orig)   0.21 bp_ic_07 (orig)      0.66 nbp_lag_04 (orig)    0.70 bp_01 (lag_1)        0.49 nbp_ar_11 (lag_3)    0.34 nbp_arlag_04 (orig)
2        Pooled      0.45                      0.74                      2.06                      0.98                      0.96                      0.72
2        Pooled ic   0.21                      0.55                      3.43                      2.55                      2.77                      0.79
2        RMSE AR     0.008                     0.047                     0.006                     0.031                     0.039                     0.025
3        Best model  0.07 nbp_ic_bic (lag_2)   0.39 bp_09 (lag_2)        0.34 nbp_05 (lag_2)       0.31 bp_04 (orig)         0.19 bp_ar_08 (lag_1)     0.11 bp_ar_09 (lag_2)
3        Pooled      0.58                      0.87                      0.77                      0.41                      0.98                      0.41
3        Pooled ic   0.32                      0.89                      1.92                      1.12                      3.33                      0.49
3        RMSE AR     0.012                     0.065                     0.008                     0.040                     0.034                     0.040

Note: "Nbp" stands for a factor model with factors from a non-balanced panel, "bp" for factors from a balanced panel. "Ar" after "bp" or "nbp" marks the inclusion of the AR component (based on BIC selection), and "lag" denotes the inclusion of lagged factors. The number at the end stands for the number of factors in the model. "Bic" marks a model with BIC selection of factors. "Pooled" and "Pooled ic" stand for the median factor forecast without the inclusion of lags in factor extraction, without and with intercept correction, respectively. "RMSE AR" is the absolute root mean square error of the benchmark AR model.
Notations in brackets indicate the inclusion of additional series in the data set (orig: original series; lag_2: series with 1 and 2 lags; lag_3: series with up to 3 lags).

Additionally, in order to represent the performance of dynamic factor models graphically, we present plots of forecasts for the best forecasting factor model under different strategies for computing the factors, i.e. with the original series and with the inclusion of lagged series (up to 3 lags) in the panel from which the factors are extracted. Each figure also contains the forecasts obtained with the benchmark AR model and the realization (the actual data) for each variable that we observed. For brevity, the main text contains only the figures for one-quarter-ahead forecasts, while the corresponding figures for horizons 2 and 3 are presented in Appendix A.

4 The statistical software Gauss was used for the modelling.

Before turning to the forecasting performance of the factor models for each individual series, we briefly comment on several general observations that emerge from Table 1. First, for each series the factor models offer significant gains in forecasting precision relative to a simple AR model. This is reflected in the performance of the best models, but for some series also in the pooled (median) factor forecasts. Second, the gains in forecasting precision increase with the forecast horizon, which is especially important for practitioners whose forecasting horizon regularly exceeds one quarter. Finally, significant gains in forecasting precision are also observed for variables whose benchmark AR forecast is already relatively precise, e.g. GDP growth and growth of private consumption. Altogether, these results point to the usefulness of dynamic factor models for forecasting.
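The two summary measures used in Table 1, the rMSE and the pooled (median) forecast, can be illustrated with a minimal sketch. The function names and the numbers below are hypothetical and serve only as illustration; they are not taken from the paper's Gauss code or data.

```python
import numpy as np

def relative_mse(actual, model_forecast, benchmark_forecast):
    """MSE of a model divided by the MSE of the benchmark (here: AR) forecasts."""
    actual = np.asarray(actual, dtype=float)
    mse_model = np.mean((np.asarray(model_forecast, dtype=float) - actual) ** 2)
    mse_bench = np.mean((np.asarray(benchmark_forecast, dtype=float) - actual) ** 2)
    return mse_model / mse_bench

def pooled_median_forecast(forecasts):
    """Median across models; `forecasts` has shape (n_models, n_periods)."""
    return np.median(np.asarray(forecasts, dtype=float), axis=0)

# Hypothetical growth rates, for illustration only.
actual = np.array([0.010, 0.012, 0.009, 0.015])
ar_forecast = np.array([0.014, 0.008, 0.013, 0.011])
factor_forecast = np.array([0.011, 0.011, 0.010, 0.013])

rmse_rel = relative_mse(actual, factor_forecast, ar_forecast)
improvement = 1.0 - rmse_rel  # share by which the MSE falls relative to AR
print(f"rMSE = {rmse_rel:.3f}, improvement over AR = {improvement:.1%}")
```

An rMSE below 1 means the model beats the AR benchmark; the percentage improvements quoted in the text correspond to 1 minus the rMSE, so an rMSE of 0.15 is read as an 85% improvement.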
3.1 GDP growth

When forecasting GDP one quarter ahead, the best model is the factor model with the AR component and 11 factors from the unbalanced panel, with the data set including series lagged by up to one quarter. Its MSE relative to the AR model is 0.15. This model represents an 85% improvement over the AR model, while in comparison the same model with factors obtained without using lagged series in factor estimation yields a 64% improvement over the AR model (see Statistical Appendix or Table 3 for details). We can also observe an increase in the rMSE when further series with more lags are added. A likely explanation is that, when new series are added, the positive effect of the additional information is outweighed by the effect of "oversampling" described by Boivin and Ng (2006).5

Figure 1: Forecast of GDP growth one quarter ahead for the period 2004Q3–2007Q1

5 Schneider and Spitzer (2004) also observe significantly better results with smaller data subsets.

At horizons 2 and 3, the factor models improve on the AR model even more, by as much as 90%. However, the inclusion of lagged series improves the forecasts only at the three-quarter-ahead horizon, and only slightly, while at horizon 2 the factor model with factors extracted from the original series performs best. The structure of the best models at horizons 2 and 3 is similar: factors are extracted from the unbalanced panel and intercept correction is used as a forecast-robustifying device.

The performance of different models in forecasting GDP at horizon 1 is presented in Figure 1. It is reassuring to find that factor models are able to capture the turning points in the GDP growth rate rather well. Factor models also clearly outperform the AR model, which is unable to predict the upswing in growth at the end of the sample.
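The intercept correction mentioned above can be illustrated with a short sketch. This section does not spell out the exact form of the correction, so the code assumes one common variant: the h-step-ahead forecast is shifted by the most recent forecast error that is observable when the forecast is made. The function name and this specific form are assumptions, not the authors' implementation.

```python
import numpy as np

def intercept_corrected(forecasts, actuals, h):
    """Add the latest observable h-step forecast error to each forecast.

    forecasts[t] is the forecast of actuals[t] made h periods earlier.
    When that forecast was made, the newest known error was the one for
    target period t - h, so it is added to forecasts[t] (assumed form).
    The first h forecasts have no observed error and are left unchanged.
    """
    if h < 1:
        raise ValueError("h must be a positive integer")
    forecasts = np.asarray(forecasts, dtype=float)
    actuals = np.asarray(actuals, dtype=float)
    errors = actuals - forecasts
    corrected = forecasts.copy()
    corrected[h:] += errors[:-h]
    return corrected

# Hypothetical example: a model that persistently under-predicts.
raw = np.array([1.0, 1.0, 1.0, 1.0])
observed = np.array([2.0, 3.0, 2.0, 2.0])
print(intercept_corrected(raw, observed, h=1))
```

The design idea is that a persistent forecast error, for instance after a structural break, is carried forward into the next forecast, which can make forecasts more robust at the cost of a higher forecast variance.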
3.2 Growth of gross fixed capital formation

Factor models clearly outperform the AR model in forecasting investment (growth of GFCF, gross fixed capital formation) at all 3 horizons. As in the case of GDP, at horizon 1 the inclusion of 1 set of lagged series is beneficial, while for horizons 2 and 3 the best "subsets" are the original set of series and the set with 2 additional sets of lagged series, respectively. For the one-quarter horizon, presented in Figure 2, the best factor model yields a 63% improvement over the AR model. Graphically, however, the forecasts do not seem as "good": for some periods they predict the wrong sign of the change in the growth rate, while for other periods the predicted direction of change is correct but the magnitude is sometimes far from the observed values. Overall, this is also the case for the competing models, meaning that, as evident from Table 1, the overall forecasting precision for aggregate investment in Slovenia is rather limited. Similarly to the case of GDP growth, however, factor models do not suffer from marked under-prediction at the end of the period, as the benchmark AR model does. Such a result is expected, given that investment was one of the main driving forces behind the marked increase in the growth rate of GDP in 2006 and 2007.

Figure 2: Forecasting growth of gross fixed capital formation one quarter ahead for the period 2004Q3–2007Q1

3.3 Growth of private consumption

In forecasting private consumption, the positive effect of including additional sets of lagged series in factor extraction is observed only at horizon 3.
While the best factor model from the original series yields a 51% improvement compared to the AR model, the inclusion of additional series with up to two lags yields another 15-percentage-point improvement, lowering the rMSE to 0.34. At horizons 1 and 2, the inclusion of lagged series in the factor extraction does not improve the rMSE. However, the model with 3 additional sets of lagged series performs well at the end of the sample at horizon 1 (Figure 3). It even forecasts the outlier in 2006Q2 correctly, not only in the direction of change but also in the magnitude of growth.

Figure 3: Forecasting growth of private consumption one quarter ahead for the period 2004Q3–2007Q1
Series plotted: AR model; original series nbp_lag_03, rMSE(0.63); lagged series_1 bp_bic, rMSE(0.69); lagged series_2 bp_bic, rMSE(0.69); lagged series_3 nbp_10, rMSE(0.64).

3.4 Growth of exports

Factor models outperform the AR model in one-step-ahead forecasts of exports. The best factor model, using a dataset with 3 additional lags of variables, has an MSE relative to the AR model of 0.87. The gains in forecasting precision from using factor models increase with the forecasting horizon, amounting to 69% for three quarters ahead. However, overall forecasting precision remains rather low. As evident from Figure 4, neither the AR model nor the factor models are able to predict the large drops and increases in export growth over the period. The same applies to forecasts at other horizons.
Figure 4: Forecasting growth of exports one quarter ahead for the period 2004Q3–2007Q1
Series plotted: EXP (realization); AR model; original series nbp_arlag_bic, rMSE(0.95); lagged series_1 bp_01, rMSE(0.92); lagged series_2 nbp_ar_01, rMSE(0.91); lagged series_3 nbp_ar_02, rMSE(0.87).

3.5 Growth of imports

The factor models clearly outperform the AR model, by as much as 81% at horizon 3 (see Table 1). Generally, the inclusion of lagged series is beneficial. A notable feature in forecasting imports is the relative strength of the factor models at horizon 3. At horizon 1, on the other hand, even though the factor models outperform the AR model, no model is able to capture the turning points in the growth rate correctly.6

Figure 5: Forecasting growth of imports one quarter ahead for the period 2004Q3–2007Q1
Series plotted: IMP (realization); AR model; original series bp_02, rMSE(0.4); lagged series_1 nbp_02, rMSE(0.41); lagged series_2 nbp_02, rMSE(0.39); lagged series_3 nbp_bic, rMSE(0.42).

6 For forecasts for horizons 2 and 3 see Appendix A.

3.6 Growth of industrial output

A very important variable for forecasting economic activity is the growth rate of industrial output (IPSID in Figure 6), because the Statistical Office publishes official data with smaller delays. The best model is the factor model with 2 factors from the non-balanced panel and the AR component, with the extended set of lagged series in factor extraction (up to one lag included). The model represents an improvement over the AR model, with a relative MSE of 0.54. As Figure 6 shows, the direction of change in the growth rate is incorrectly predicted only for 2005Q2 and 2006Q4.
For other horizons, the factor models improve on the AR model by even more, up to 89% at the three-quarter horizon. At that horizon, extending the original dataset with lagged series improves the forecasting performance of the factor models, while at horizon 2 the best model is the one with factors extracted from the original series.

Figure 6: Forecasting growth of industrial output one quarter ahead for the period 2004Q3–2007Q1

3.7 The role of lagged variables in factor extraction

It follows from the discussion above that in several cases the best forecasting model uses factors extracted from a dataset that also includes lagged values of the original series. In this respect it may be informative to investigate the role of adding lagged variables in factor extraction more formally. In Table 2 we present the median of forecasts obtained with factor models without intercept correction for all three horizons. Shaded cells correspond to cases where the inclusion of lags in factor extraction increases the average forecasting performance of the factor models. This occurs in more than half the cases, even though the gains are not very large. For GDP growth there are improvements at all three horizons, while for private consumption and industrial output growth the same is observed at the two- and three-quarter horizons. Combined with the evidence of the best models in Table 1, this also suggests that the inclusion of lagged variables in factor extraction is an advisable approach in applied work.

Table 2: Median factor forecasts according to the number of lags included in the extraction of factors (rMSE relative to AR)

Horizon | Variable | No. of lags: 0 | 1 | 2 | 3
h = 1 | GDP growth | 0.62 | 0.48 | 0.53 | 0.58
h = 1 | Imports growth | 0.63 | 0.54 | 0.53 | 0.49
h = 1 | Exports growth | 1.51 | 1.27 | 1.34 | 1.29
h = 1 | GFCF growth | 0.77 | 0.67 | 0.75 | 0.84
h = 1 | Priv. cons. growth | 1.02 | 1.60 | 1.86 | 1.18
h = 1 | Ind. prod. growth | 0.78 | 0.78 | 0.82 | 0.86
h = 2 | GDP growth | 0.45 | 0.39 | 0.45 | 0.53
h = 2 | Imports growth | 0.96 | 1.03 | 0.96 | 0.97
h = 2 | Exports growth | 0.98 | 1.00 | 1.02 | 1.02
h = 2 | GFCF growth | 0.74 | 0.74 | 0.94 | 1.08
h = 2 | Priv. cons. growth | 2.06 | 1.82 | 1.42 | 1.28
h = 2 | Ind. prod. growth | 0.72 | 0.60 | 0.55 | 0.53
h = 3 | GDP growth | 0.58 | 0.43 | 0.47 | 0.53
h = 3 | Imports growth | 0.98 | 0.91 | 0.95 | 0.99
h = 3 | Exports growth | 0.41 | 0.44 | 0.58 | 0.60
h = 3 | GFCF growth | 0.87 | 0.99 | 1.13 | 1.28
h = 3 | Priv. cons. growth | 0.77 | 0.72 | 0.56 | 0.79
h = 3 | Ind. prod. growth | 0.41 | 0.23 | 0.23 | 0.22

3.8 The role of preselection of variables

In the empirical application we also considered an additional technique with which we aimed to enhance the information content of the factors. As described in Section 2.2, we also tested preselection of variables, as proposed by Boivin and Ng (2006). For brevity, the results in Table 3 are given only for one-step-ahead forecasts.7

Table 3: MSE of the best factor models relative to AR (rMSE) with and without preselection of variables at a one-quarter-ahead forecast horizon

Variable | Without preselection | With preselection
GDP growth | 0.36 nbp_icar_05 | 0.41 bp_icar_01
GFCF growth | 0.51 bp_07 | 0.79 bp_ic_ar_bic
PCONS growth | 0.63 nbp_lag_03 | 1.01 bp_ic_lag_01
EXP growth | 0.95 bp_02 | 0.69 nbp_03
IMP growth | 0.40 bp_02 | 0.31 bp_04
IPSID growth | 0.58 nbp_ar_04 | 0.37 nbp_ar_04

Note: see notes to Table 1.

7 See also the Statistical Appendix on the website.

In half of the cases (growth of imports, exports and industrial production) preselection offers improvements in forecasting accuracy. The gains are quite large for exports and industrial output, of the order of 20 percentage points relative to the benchmark AR model. The result becomes even stronger if compared to the figures in Table 1.
We can observe that the best factor models with preselection of variables at the stage of factor extraction are also the best models overall for the three above-mentioned variables.

4 CONCLUDING DISCUSSION

The literature offers many applications of dynamic factor models in forecasting macroeconomic variables. Very few of them deal with short time series and transition countries that have witnessed immense structural changes in the process of transition. For such countries, the length of the time series of quarterly data does not exceed 50 observations. In such circumstances researchers face significant difficulties in obtaining robust model estimates and in evaluating the forecasting performance of competing models in a pseudo real-time context. Relying on simple time series models, such as autoregressions, often seems the only choice. Thus, having more complex but robust and viable forecasting models is especially important under such conditions. To a certain extent, factor models allow us to compensate for the shortness of the time series by exploiting the cross-section dimension: a large number of different macroeconomic variables are readily available from public sources, even for transition economies.

To date, only the application of Banerjee et al. (2005, 2006) has been documented in the literature. In this paper we focus on Slovenia, extend their approach to a wider coverage of variables and use a richer data set for factor extraction. In addition, we consider two technical modifications of the factor estimation procedure. The first is the data preselection proposed by Boivin and Ng (2006), and the second is the inclusion of lagged variables in the panel data set from which the factors are extracted. The application of both methods in the context of short time series is new. Evaluation of their merit in a case where the length of the time series is short is thus even more important.
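The second modification, including lagged variables in the panel from which the factors are extracted, can be sketched as follows. This is a minimal illustration that assumes a balanced panel with no missing values and uses plain principal components computed by a singular value decomposition; the paper itself relies on the EM algorithm for the unbalanced panel and on Gauss code that is not reproduced here. The function names and the simulated data are hypothetical.

```python
import numpy as np

def add_lags(panel, max_lag):
    """Stack a (T, N) panel with its lags 1..max_lag, column block by block.

    Returns an array of shape (T - max_lag, N * (max_lag + 1)) whose
    column blocks are x_t, x_{t-1}, ..., x_{t-max_lag}.
    """
    n_obs = panel.shape[0]
    blocks = [panel[max_lag - k : n_obs - k] for k in range(max_lag + 1)]
    return np.hstack(blocks)

def extract_factors(panel, n_factors):
    """Principal-component factor estimates from a standardized panel.

    Assumes every column has non-zero variance. The factors are
    identified only up to scale and rotation.
    """
    z = (panel - panel.mean(axis=0)) / panel.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_factors] * s[:n_factors]

# Simulated stand-in for a short quarterly panel (40 quarters, 5 series).
rng = np.random.default_rng(0)
panel = rng.standard_normal((40, 5))

augmented = add_lags(panel, max_lag=2)          # corresponds to "lag_2" in Table 1
factors = extract_factors(augmented, n_factors=3)
print(augmented.shape, factors.shape)
```

Each added lag costs one observation at the start of the sample, which is a relevant trade-off when the series contain fewer than 50 quarterly observations.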
Dynamic factor models potentially offer large gains in forecasting precision relative to simple AR models. Moreover, their comparative advantage generally increases with the forecast horizon. This characteristic is especially important for policymakers, whose forecast horizon is never very short. Preselecting variables and including lags in the factor estimation stage also produced positive results. Gains in forecasting precision with these two modifications appear to be important in cases where classic dynamic factor models perform less dominantly. Altogether, the results of this paper support the usefulness of dynamic factor models in applied forecasting in situations where a lack of data does not permit the application of complex large-scale structural models.

BIBLIOGRAPHY

1. Altissimo, F., Bassanetti, A., Cristadoro, R., Forni, M., Hallin, M., Lippi, M., Reichlin, L. (2001). EuroCOIN: a real time coincident indicator of the euro area business cycle. CEPR Working Paper, No. 3108.
2. Artis, M., Banerjee, A., Marcellino, M. (2005). Factor forecasts for the UK. Journal of Forecasting 24: 279-298.
3. Bai, J. (2003). Inferential theory for factor models of large dimensions. Econometrica 71: 135-171.
4. Bai, J., Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica 70: 191-221.
5. Bai, J., Ng, S. (2007). Determining the number of primitive shocks in factor models. Journal of Business & Economic Statistics 25: 52-60.
6. Banerjee, A., Marcellino, M. (2003). Are there any reliable leading indicators for US inflation and GDP growth? IGIER Working Paper, No. 236.
7. Banerjee, A., Marcellino, M., Masten, I. (2003). Leading indicators for Euro area inflation and GDP growth. CEPR Working Paper, No. 3893.
8. Banerjee, A., Marcellino, M., Masten, I. (2005). Forecasting Macroeconomic Variables for the Accession Countries. ECB Working Paper, No. 482.
9. Banerjee, A., Marcellino, M., Masten, I. (2006). Forecasting Macroeconomic Variables for the New Member States. In: Artis, M., Banerjee, A., Marcellino, M. (eds.), The Central and Eastern European Countries and the European Union. Cambridge: Cambridge University Press.
10. Banerjee, A., Marcellino, M., Masten, I. (2008). Forecasting Macroeconomic Variables Using Diffusion Indexes in Short Samples with Structural Change. In: Wohar, M., Rapach, D. (eds.), Handbook of Forecasting in Presence of Structural Change and Model Uncertainty. Elsevier, forthcoming.
11. Bernanke, B.S., Boivin, J., Eliasz, P. (2005). Measuring the effects of monetary policy: A factor-augmented vector autoregressive (FAVAR) approach. Quarterly Journal of Economics 120: 387-422.
12. Bernanke, B.S., Boivin, J. (2003). Monetary policy in a data-rich environment. Journal of Monetary Economics 50: 525-546.
13. Boivin, J., Ng, S. (2005). Understanding and Comparing Factor-Based Forecasts. International Journal of Central Banking, vol. 1(3), December.
14. Boivin, J., Ng, S. (2006). Are more data always better for factor analysis? Journal of Econometrics, vol. 127(1), pages 169-194.
15. Breitung, J. (2005). Estimation and inference in dynamic factor models. University of Bonn, mimeo.
16. Breitung, J., Kretschmer, U. (2005). Identification and estimation of dynamic factors from large macroeconomic panels. University of Bonn, mimeo.
17. Brillinger, D.R. (1981). Time Series Data Analysis and Theory. Holt, Rinehart and Winston, New York.
18. Brisson, M., Campbell, B., Galbraith, J.W. (2001). Forecasting some low predictability time series using diffusion indices. CIRANO Working Papers, No. 2001s-46.
19. Camba-Mendez, G., Kapetanios, G. (2004). Forecasting euro area inflation using dynamic factor measures of underlying inflation. ECB Working Paper, No. 402.
20. Cattell, R.B. (1966). The Scree test for the number of factors. Multivariate Behavioral Research 1: 245-276.
21. Chamberlain, G., Rothschild, M. (1983). Arbitrage, factor structure and mean-variance analysis in large asset markets. Econometrica 51: 1305-1324.
22. Cimadomo, J. (2003). The effects of systematic monetary policy on sectors: a factor model analysis. ECARES - Universite Libre de Bruxelles, mimeo.
23. Cristadoro, R., Forni, M., Reichlin, L., Veronese, G. (2001). A Core Inflation Index for the Euro Area. CEPR Discussion Paper, No. 3097.
24. Eickmeier, S. (2004). Business cycle transmission from the US to Germany - a structural factor approach. Bundesbank Discussion Paper, No. 12/2004, revised version.
25. Eickmeier, S. (2005). Common stationary and non-stationary factors in the euro area analyzed in a large-scale factor model. Bundesbank Discussion Paper, No. 2/2005.
26. Eickmeier, S., Breitung, J. (2005). How synchronized are central and east European economies with the Euro area? Evidence from a structural factor model. Bundesbank Discussion Paper, No. 20/2005.
27. Fagan, G., Henry, J., Mestre, R. (2001). An area wide model (AWM) for the Euro area. ECB Working Paper, No. 42.
28. Favero, C., Marcellino, M., Neglia, F. (2005). Principal components at work: the empirical analysis of monetary policy with large datasets. Journal of Applied Econometrics 20: 603-620.
29. Figlewski, S. (1983). Optimal price forecasting using survey data. Review of Economics and Statistics 65: 813-836.
30. Figlewski, S., Urich, T. (1983). Optimal aggregation of money supply forecasts: accuracy, profitability and market efficiency. The Journal of Finance 28: 695-710.
31. Forni, M., Hallin, M., Lippi, M., Reichlin, L. (2003). Do Financial Variables Help Forecasting Inflation and Real Activity in the Euro Area? Journal of Monetary Economics 50: 1243-1255.
32. Forni, M., Hallin, M., Lippi, M., Reichlin, L. (2005). The generalized dynamic factor model: one-sided estimation and forecasting. Journal of the American Statistical Association 100: 830-840.
33. Forni, M., Giannone, D., Lippi, M., Reichlin, L. (2004). Opening the Black Box: Structural Factor Models versus Structural VARS. Universite Libre de Bruxelles, mimeo.
34. Forni, M., Hallin, M., Lippi, M., Reichlin, L. (2000). The Generalized Dynamic Factor Model: Identification and Estimation. Review of Economics and Statistics 82: 540-554.
35. Forni, M., Hallin, M., Lippi, M., Reichlin, L. (2002). The generalized dynamic factor model: consistency and convergence rates. Journal of Econometrics 82: 540-554.
36. Giannone, D., Sala, L., Reichlin, L. (2002). Tracking Greenspan: systematic and unsystematic monetary policy revisited. ECARES-ULB, mimeo.
37. Giannone, D., Sala, L., Reichlin, L. (2004). Monetary policy in real time. Forthcoming in: Gertler, M., Rogoff, K. (eds.), NBER Macroeconomics Annual, MIT Press.
38. Hansson, J., Jansson, P., Loef, M. (2005). Business survey data: Do they help in forecasting GDP growth? International Journal of Forecasting 21: 377-399.
39. Helbling, T., Bayoumi, T. (2003). Are they all in the same boat? The 2000-2001 growth slowdown and the G7-business cycle linkages. IMF Working Paper, WP/03/46.
40. Jagrič, T. (2003). A Nonlinear Approach to Forecasting with Leading Economic Indicators. Studies in Nonlinear Dynamics and Econometrics 7: 0-19.
41. Jöreskog, K.G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika 34: 183-202.
42. Jimenez-Rodriguez, M., Sanchez, M. (2005). Oil price shocks and real GDP growth: empirical evidence for some OECD countries. Applied Economics 37: 201-228.
43. Kapetanios, G. (2004). A Note on Modelling Core Inflation for the UK Using a New Dynamic Factor Estimation Method and a Large Disaggregated Price Index Dataset. Economics Letters 85: 63-69.
44. Kapetanios, G., Marcellino, M. (2003). A comparison of estimation methods for dynamic factor models of large dimensions. Queen Mary University of London, Working Paper No. 489.
45. Korhonen, I. (2003). Some empirical tests on the integration of economic activity between the Euro area and the accession countries: a note. Economics of Transition 11: 1-20.
46. Malek Mansour, J. (2003). Do national business cycles have an international origin? Empirical Economics 28: 223-247.
47. Marcellino, M., Stock, J.H., Watson, M.W. (2000). A dynamic factor analysis of the EMU. IGIER Bocconi, mimeo.
48. Marcellino, M., Stock, J.H., Watson, M.W. (2003). Macroeconomic forecasting in the Euro area: country-specific versus Euro wide information. European Economic Review 47: 1-18.
49. Onatski, A. (2005). Determining the number of factors from the empirical distribution of eigenvalues. Economics Discussion Paper Series, Columbia University, 0405-19.
50. Peersman, G. (2005). What caused the early millennium slowdown? Evidence based on vector autoregressions. Journal of Applied Econometrics 20: 185-207.
51. Reijer den, A.H.J. (2005). Forecasting Dutch GDP using large scale factor models. DNB Working Paper, No. 28.
52. Sala, L. (2003). Monetary policy transmission in the Euro area: a factor model approach. IGIER Bocconi, mimeo.
53. Schneider, M., Spitzer, M. (2004). Forecasting Austrian GDP using the generalized dynamic factor model. Oesterreichische Nationalbank Working Paper, No. 89.
54. Schumacher, C. (2005). Forecasting German GDP using alternative factor models based on large datasets. Bundesbank Discussion Paper, No. 24/2005.
55. Schumacher, C., Dreger, C. (2004). Estimating large-scale factor models for economic activity in Germany: do they outperform simpler models? Jahrbuecher fuer Nationaloekonomie und Statistik 224: 731-750.
56. Stock, J.H., Watson, M.W. (1999). Forecasting inflation. Journal of Monetary Economics 44: 293-335.
57. Stock, J.H., Watson, M.W. (2002a). Macroeconomic forecasting using diffusion indexes. Journal of Business & Economic Statistics 20: 147-162.
58. Stock, J.H., Watson, M.W. (2002b).
Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association 97: 1167-1179.
59. Stock, J.H., Watson, M.W. (2005). Implications of dynamic factor models for VAR analysis. NBER Working Paper No. W11467.

APPENDIX A: IN-SAMPLE FORECASTS FOR 2- AND 3-QUARTER HORIZONS

Figures A.1 to A.12: Forecasts of the best factor models and benchmark AR for horizons 2 and 3 for the period 2004Q4 (2005Q1 for 3-quarter-ahead forecasts) to 2007Q1, quarterly data. The forecast results are compared to the actual realization.

Figure A.1: GDP growth, 2 quarters ahead
Series plotted: GDP (realization); AR model; original series nbp_icar_08, rMSE(0.1); lagged series_1 nbp_10, rMSE(0.1); lagged series_2 nbp_ar_11, rMSE(0.14); lagged series_3 nbp_ic_ar_08, rMSE(0.19).

Figure A.2: GDP growth, 3 quarters ahead
Series plotted: GDP (realization); AR model; original series nbp_ic_08, rMSE(0.15); lagged series_1 nbp_ic_09, rMSE(0.08); lagged series_2 nbp_ic_bic, rMSE(0.07); lagged series_3 nbp_ic_ar_11, rMSE(0.2).

Figure A.3: Growth of gross fixed capital formation, 2 quarters ahead
Series plotted: GFCF (realization); AR model; original series bp_ic_07, rMSE(0.21); lagged series_1 bp_ic_09, rMSE(0.23); lagged series_2 bp_ic_09, rMSE(0.26); lagged series_3 bp_ic_09, rMSE(0.26).

Figure A.4: Growth of gross fixed capital formation, 3 quarters ahead
Series plotted: GFCF (realization); AR model; original series bp_ar_11, rMSE(0.46); lagged series_1 bp_09, rMSE(0.44); lagged series_2 bp_09, rMSE(0.39); lagged series_3 bp_09, rMSE(0.39).

Figure A.5: Growth of private consumption, 2 quarters ahead
Series plotted: PCONS (realization); AR model; original series nbp_lag_04, rMSE(0.66); lagged series_1 bp_01, rMSE(0.83); lagged series_2 nbp_05, rMSE(0.76); lagged series_3 nbp_04, rMSE(0.84).

Figure A.6: Growth of private consumption, 3 quarters ahead
Series plotted: PCONS (realization); AR model; original series nbp_03, rMSE(0.49); lagged series_1 bp_ar_06, rMSE(0.58); lagged series_2 nbp_05, rMSE(0.34); lagged series_3 bp_ar_07, rMSE(0.53).

Figure A.7: Growth of imports, 2 quarters ahead
Series plotted: IMP (realization); AR model; original series bp_ar_07, rMSE(0.66); lagged series_1 bp_ar_09, rMSE(0.53); lagged series_2 nbp_ar_10, rMSE(0.6); lagged series_3 nbp_ar_11, rMSE(0.49).

Figure A.8: Growth of imports, 3 quarters ahead
Series plotted: IMP (realization); AR model; original series bp_ar_02, rMSE(0.22); lagged series_1 bp_ar_08, rMSE(0.19); lagged series_2 nbp_ar_12, rMSE(0.2); lagged series_3 nbp_ar_08, rMSE(0.24).

Figure A.9: Growth of exports, 2 quarters ahead
Series plotted: AR model; original series bp_01, rMSE(0.71); lagged series_1 bp_01, rMSE(0.7); lagged series_2 nbp_ar_01, rMSE(0.73); lagged series_3 nbp_ar_02, rMSE(0.7).

Figure A.10: Growth of exports, 3 quarters ahead

Figure A.11: Industrial production growth, 2 quarters ahead
Series plotted: AR model; original series nbp_arlag_04, rMSE(0.34); lagged series_1 nbp_02, rMSE(0.37); lagged series_2 nbp_ar_12, rMSE(0.34); lagged series_3 nbp_08, rMSE(0.39).

Figure A.12: Industrial production growth, 3 quarters ahead
Series plotted: AR model; original series bp_arlag_03, rMSE(0.14); lagged series_1 bp_ar_09, rMSE(0.15); lagged series_2 bp_ar_09, rMSE(0.11); lagged series_3 nbp_ar_08, rMSE(0.11).

APPENDIX B: DATA SET

Table A.1: List of series used in the quarterly data

series | transformation | description | group
dabssa | 5 | Persons in employment, business services | employment
dagradsa | 5 | Persons in employment, construction | employment
daindsa | 5 | Persons in employment, industry | employment
dajssa | 5 | Persons in employment, public services | employment
dakmetsa | 5 | Persons in employment, agriculture and fishing | employment
strbpsa | 5 | Registered unemployment rate | employment
retczpsa | 5 | Real effective exchange rate, deflated by consumer prices | exchange rate
bpldrsa | 5 | Gross wages per worker, real; manufacturing | income
bplrsa | 5 | Gross wages per worker, real | income
obrmdt30 | 2 | Interest rates short-term loan – overnight interbank average interest rate | interest rates
tlongca | 2 | Interest rates long-term loan – for capital assets | interest rates
tlongpop | 2 | Interest rates long-term loan – for housing | interest rates
tshortcc | 2 | Interest rates short-term loan – consumer credit | interest rates
ecbrf | 2 | Official refinancing operation rates, Central Bank | international interest rates
matezsa | 5 | Government bond yields, 10-year maturity – Monthly data, Eurozone | international interest rates
eurusdsa | 5 | ECB reference exchange rate, US dollar / euro | international other
gdpeu25sa | 5 | GDP EU25 | international output
gdpussa | 5 | GDP USA | international output
ipezmsa | 5 | Industrial production, Eurozone 13 (Manufacturing) | international output
ipezsa | 5 | Industrial production, Eurozone | international output
hcpidesa | 5 | Harmonized consumer price index, Germany | international prices
hcpieusa | 5 | Harmonized consumer price index, EU | international prices
ppidesa | 5 | Production price index, Germany | international prices
ppieusa | 5 | Production price index, EU | international prices
ppimdesa | 5 | Production price index, Germany (Manufacturing) | international prices
ppimeusa | 5 | Production price index, Eurozone 13 (Manufacturing) | international prices
ccezsa | 1 | Consumer confidence, Eurozone | international survey
ccdesa | 1 | Consumer confidence, Germany | international survey
icdesa | 1 | Industrial confidence, Germany | international survey
icezsa | 1 | Industrial confidence, Eurozone | international survey
ifoegesa | 1 | IFO Business Expectations for Germany - Trade and Industry (R3) | international survey
ifosgesa | 1 | IFO Business Situation for Germany - Trade and Industry (R2) | international survey
zewgesa | 1 | ZEW Indicator of Economic Sentiment for Germany - expectations | international survey
exeu25sa | 5 | Exports EU25 | international trade
exussa | 5 | Exports USA | international trade
imeu25sa | 5 | Imports Eurozone | international trade
imussa | 5 | Imports USA | international trade
exposa | 5 | Exports | international trade Slovenia
impsa | 5 | Imports | international trade Slovenia
gfcfsa | 5 | Fixed investments | other
pconssa | 5 | Private consumption | other
trgdebs | 5 | Nominal turnover of wholesale trade, deflated by CPI | other
trgdros | 5 | Volume of retail trade and motor trade turnover | other
dinvsa | 1 | Changes in inventories | output
gdpsa | 5 | GDP | output
gradsa | 5 | Value of construction put in place | output
ipsicsa | 5 | Industrial production, mining | output
ipsidsa | 5 | Industrial production, manufacturing | output
ipsiesa | 5 | Industrial production, electricity | output
cpindsa | 5 | Consumer price index, end of quarter | prices
cpipsa | 5 | Consumer price index of goods | prices
cpisa | 5 | Consumer price index | prices
cpissa | 5 | Consumer price index, services | prices
infnhnesa | 5 | Core inflation | prices
ppiconsa | 5 | Production price index, consumer goods industries | prices
ppiintsa | 5 | Production price index, intermediate goods industries | prices
ppiinvsa | 5 | Production price index, investment goods industries | prices
ppisa | 5 | Production price index | prices
prcenesa | 5 | Non-regulated prices | prices
recenesa | 5 | Regulated prices | prices
ccsisa | 1 | Consumer Confidence, Slovenia | survey
icsisa | 1 | Industrial Confidence, Slovenia | survey
rtcsisa | 1 | Retail trade confidence - Slovenia | survey