CONTENTS

Metodološki zvezki, Vol. 11, No. 2, 2014

Wararit Panichkitkosolkul
Confidence Interval for the Process Capability Index Cp Based on the Bootstrap-t Confidence Interval for the Standard Deviation

Antonio A. Romano and Giuseppe Scandurra
Investments in Renewable Energy Sources in OPEC Members: a Dynamic Panel Approach

Pavol Kral, Lukas Sobisek and Maria Stachova
A Distance Based Measure of Data Quality

Metodološki zvezki, Vol. 11, 2014
Reviewers for Volume Eleven

Rok Blagus, Andrej Blejec, Lea Bregar, Matevž Bren, Germa Coenders, Lluís Coromina, Patrick Doreian, Anuška Ferligoj, Herwig Friedl, Georg Heinze, Nataša Kejžar, Katarina Košmelj, Irena Krizman, Nada Lavrač, Giovanni Millo, Irena Ograjenšek, Jože Rovan, Janez Stare, Damjan Škulj, Aleš Toman, Vasja Vehovar, Gaj Vidmar, Anja Žnidaršič, Aleš Žiberna

Confidence Interval for the Process Capability Index Cp Based on the Bootstrap-t Confidence Interval for the Standard Deviation

Wararit Panichkitkosolkul¹

Abstract

This paper proposes a confidence interval for the process capability index Cp based on the bootstrap-t confidence interval for the standard deviation. A Monte Carlo simulation study was conducted to compare the performance of the proposed confidence interval with the existing confidence interval, which is based on the chi-square confidence interval for the standard deviation. Simulation results show that the proposed confidence interval performs well in terms of coverage probability for more heavily skewed distributions. On the other hand, the existing confidence interval has a coverage probability close to the nominal level for symmetrical or less skewed distributions. R code for computing the confidence intervals is provided.

1 Introduction

Statistical process quality control has been widely applied in many industries. One of the quality measurement tools used to improve quality and productivity is the process capability index (PCI).
Process capability indices are practical tools for establishing the relationship between the actual process performance and the manufacturing specifications. Although there are many process capability indices, the most commonly used index is Cp (Kane, 1986; Zhang, 2010). In this paper, we focus on the process capability index Cp, defined by Kane (1986) as

$$C_p = \frac{USL - LSL}{6\sigma}, \tag{1}$$

where USL is the upper specification limit, LSL is the lower specification limit, and $\sigma$ is the process standard deviation. The numerator of Cp gives the size of the range over which the process measurements can vary. The denominator gives the size of the range over which the process actually varies (Kotz and Lovelace, 1998).

Because the process standard deviation is unknown, it must be estimated from the sample data $\{X_1, \ldots, X_n\}$. The sample standard deviation

$$S = \left[(n-1)^{-1}\sum_{i=1}^{n}(X_i - \bar{X})^2\right]^{1/2}$$

is used to estimate the unknown parameter $\sigma$ in Equation (1). The estimator of the process capability index Cp is therefore

$$\hat{C}_p = \frac{USL - LSL}{6S}. \tag{2}$$

Although the point estimator of the capability index Cp shown in Equation (2) can be a useful measure, a confidence interval is more useful because it provides much more information about the population characteristic of interest than a point estimate does (e.g., Smithson, 2001; Thompson, 2002; Steiger, 2004). The confidence interval for the capability index Cp is constructed by using the pivotal quantity $Q = (n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$. Therefore, the $(1-\alpha)100\%$ confidence interval for the capability index Cp is

$$\left(\hat{C}_p\sqrt{\frac{\chi^2_{\alpha/2,\,n-1}}{n-1}},\ \hat{C}_p\sqrt{\frac{\chi^2_{1-\alpha/2,\,n-1}}{n-1}}\right), \tag{3}$$

where $\chi^2_{\alpha/2,\,n-1}$ and $\chi^2_{1-\alpha/2,\,n-1}$ are the $(\alpha/2)100$th and $(1-\alpha/2)100$th percentiles of the central chi-square distribution with $n-1$ degrees of freedom.

¹ Department of Mathematics and Statistics, Faculty of Science and Technology, Thammasat University, Thailand; wararit@mathstat.sci.tu.ac.th
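The point estimate in Equation (2) and the chi-square interval in Equation (3) can be sketched in a few lines of R. This is only an illustration: the function name `cp_chisq_ci`, the simulated data and the specification limits are assumptions chosen for the example, not values taken from the paper.

```r
# Sketch: point estimate of Cp (Equation (2)) and its chi-square
# confidence interval (Equation (3)). The function name, the simulated
# data and the specification limits are illustrative assumptions.
cp_chisq_ci <- function(x, LSL, USL, alpha = 0.05) {
  n <- length(x)
  cp_hat <- (USL - LSL) / (6 * sd(x))
  lower <- cp_hat * sqrt(qchisq(alpha / 2, df = n - 1) / (n - 1))
  upper <- cp_hat * sqrt(qchisq(1 - alpha / 2, df = n - 1) / (n - 1))
  c(estimate = cp_hat, lower = lower, upper = upper)
}

set.seed(1)
x <- rnorm(50, mean = 50, sd = 1)          # normal data, true Cp = 1 below
res <- cp_chisq_ci(x, LSL = 47, USL = 53)
print(res)
```

Because the lower chi-square percentile divided by n - 1 is below one and the upper one is above one for these settings, the interval always brackets the point estimate, although it is not symmetric around it.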
The confidence interval for the process capability index Cp shown in Equation (3) is intended for normally distributed data. Its coverage probability is close to the nominal value of $1-\alpha$ when the data are normally distributed. However, the underlying process distributions are non-normal in many industrial processes (e.g., Chen and Pearn, 1997; Bittanti et al., 1998; Wu et al., 1999; Chang et al., 2002; Ding, 2004). In these cases, the coverage probability of the confidence interval can be appreciably below $1-\alpha$. Cojbasic and Tomovic (2007) presented a nonparametric confidence interval for the population variance of a skewed distribution, based on ordinary t-statistics combined with the bootstrap method. In this paper, we propose a new confidence interval for the process capability index Cp based on the bootstrap-t confidence interval proposed by Cojbasic and Tomovic (2007).

The paper is organized as follows. In Section 2, the theoretical background of the existing confidence interval for Cp is discussed. In Section 3, we provide an analytical formula for the confidence interval for Cp based on the bootstrap-t confidence interval for the standard deviation. In Section 4, the performance of the confidence intervals for Cp is investigated through a Monte Carlo simulation study. Conclusions are provided in the final section.

2 Existing confidence interval for the process capability index

Suppose $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. A well-known $(1-\alpha)100\%$ confidence interval for the population variance $\sigma^2$, using the pivotal quantity $Q = (n-1)S^2/\sigma^2$, is (Cojbasic and Loncar 2011)

$$\left(\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}},\ \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}}\right). \tag{4}$$

Taking square roots of the limits in Equation (4) and substituting the resulting bounds for $\sigma$ into Equation (1) gives the existing confidence interval for Cp:

$$CI_1 = \left(\hat{C}_p\sqrt{\frac{\chi^2_{\alpha/2,\,n-1}}{n-1}},\ \hat{C}_p\sqrt{\frac{\chi^2_{1-\alpha/2,\,n-1}}{n-1}}\right). \tag{5}$$

3 Proposed confidence interval for the process capability index

The bootstrap, introduced by Efron (1979), is a computer-based resampling method for assigning measures of accuracy to statistical estimates (Efron and Tibshirani, 1993).
For a sequence of independent and identically distributed (i.i.d.) random variables, the bootstrap procedure can be defined as follows (Tosasukul et al., 2009). Let $X_1, X_2, \ldots, X_n$ be independently and identically distributed random variables from some distribution with mean $\mu$ and variance $\sigma^2$. Let the random variables $\{X_j^*, 1 \le j \le m\}$ be the result of sampling $m$ times with replacement from the $n$ observations $X_1, X_2, \ldots, X_n$. The random variables $\{X_j^*, 1 \le j \le m\}$ are called the bootstrap sample from the original data $X_1, X_2, \ldots, X_n$.

A confidence interval for the population variance can be constructed using the aforementioned pivotal quantity $Q = (n-1)S^2/\sigma^2$. For large sample sizes, a central chi-square distribution with $n-1$ degrees of freedom can be approximated by a normal distribution with mean $n-1$ and variance $2(n-1)$ (Cojbasic and Tomovic, 2007). Therefore, the distribution of the standardized variable

$$Z = \frac{\dfrac{(n-1)S^2}{\sigma^2} - (n-1)}{\sqrt{2(n-1)}} = \frac{S^2 - \sigma^2}{\sqrt{\operatorname{var}(S^2)}}$$

converges to a standard normal distribution as $n$ increases to infinity. The bootstrap confidence interval for $\sigma^2$ is calculated based on the statistic

$$T = \frac{S^2 - \sigma^2}{\sqrt{\widehat{\operatorname{var}}(S^2)}},$$

where $\widehat{\operatorname{var}}(S^2)$ is a consistent estimator of the variance of $S^2$. Casella and Berger (2001) give the estimator of $\operatorname{var}(S^2)$ for a non-normal distribution as

$$\widehat{\operatorname{var}}(S^2) = \frac{1}{n}\left(\hat{\mu}_4 - \frac{n-3}{n-1}S^4\right), \qquad \hat{\mu}_4 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^4.$$

After drawing $B$ bootstrap samples, we compute in each bootstrap sample the value of the statistic

$$T^* = \frac{S^{*2} - S^2}{\sqrt{\widehat{\operatorname{var}}(S^{*2})}}, \tag{6}$$

where $S^{*2}$ is a bootstrap replication of the statistic $S^2$,

$$\widehat{\operatorname{var}}(S^{*2}) = \frac{1}{n}\left(\hat{\mu}_4^* - \frac{n-3}{n-1}S^{*4}\right), \qquad \hat{\mu}_4^* = \frac{1}{n}\sum_{i=1}^{n}(X_i^* - \bar{X}^*)^4.$$

The $(1-\alpha)100\%$ bootstrap-t confidence interval for $\sigma^2$ is

$$\left(\frac{S^2\sqrt{2(n-1)}}{2t^*_{1-\alpha/2} + \sqrt{2(n-1)}},\ \frac{S^2\sqrt{2(n-1)}}{2t^*_{\alpha/2} + \sqrt{2(n-1)}}\right),$$

where $t^*_{\alpha/2}$ and $t^*_{1-\alpha/2}$ are the $(\alpha/2)100$th and $(1-\alpha/2)100$th percentiles of $T^*$ shown in Equation (6).
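As a rough illustration of the resampling step, the following R sketch computes $B$ replications of $T^*$ from Equation (6) together with their empirical percentiles. The function names `var_s2_hat` and `boot_t_star` and the simulated sample are assumptions made for this example; the sketch follows Equation (6) as written in the text, which is not necessarily how the statistic is computed in the appendix code.

```r
# Sketch of the bootstrap-t replications T* in Equation (6).
# Function names and the simulated data are illustrative assumptions.

# Estimate of var(S^2) without assuming normality:
# (mu4_hat - (n - 3) / (n - 1) * S^4) / n
var_s2_hat <- function(x) {
  n <- length(x)
  mu4_hat <- mean((x - mean(x))^4)
  (mu4_hat - (n - 3) / (n - 1) * var(x)^2) / n
}

# B bootstrap replications of T* = (S*^2 - S^2) / sqrt(var_hat(S*^2))
boot_t_star <- function(x, B = 1000) {
  n <- length(x)
  s2 <- var(x)
  replicate(B, {
    xs <- sample(x, n, replace = TRUE)   # resample with replacement
    (var(xs) - s2) / sqrt(var_s2_hat(xs))
  })
}

set.seed(2014)
x <- rgamma(30, shape = 4, rate = 2) + 48   # right-skewed, mean 50, sd 1
t_star <- boot_t_star(x, B = 1000)
alpha <- 0.05
t_q <- quantile(t_star, probs = c(alpha / 2, 1 - alpha / 2))
print(t_q)
```

The variance estimate is strictly positive whenever the resample is not constant, so the replications are well defined for continuous data of moderate size; a resample consisting of a single repeated value would need separate handling.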
Additionally, the $(1-\alpha)100\%$ confidence interval for the standard deviation $\sigma$ is

$$\left(\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{1-\alpha/2} + \sqrt{2(n-1)}}\right]^{1/2},\ \left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{\alpha/2} + \sqrt{2(n-1)}}\right]^{1/2}\right). \tag{7}$$

Then, from Equation (7), we construct the confidence interval for Cp based on the bootstrap-t confidence interval for the standard deviation:

$$\begin{aligned}
P\left(\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{1-\alpha/2} + \sqrt{2(n-1)}}\right]^{1/2} < \sigma < \left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{\alpha/2} + \sqrt{2(n-1)}}\right]^{1/2}\right) &= 1-\alpha \\
P\left(\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2} < \frac{1}{\sigma} < \left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{1-\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2}\right) &= 1-\alpha \\
P\left(\frac{USL - LSL}{6}\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2} < \frac{USL - LSL}{6\sigma} < \frac{USL - LSL}{6}\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{1-\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2}\right) &= 1-\alpha \\
P\left(\frac{USL - LSL}{6}\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2} < C_p < \frac{USL - LSL}{6}\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{1-\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2}\right) &= 1-\alpha.
\end{aligned}$$

Therefore, the confidence interval for Cp based on the bootstrap-t confidence interval for the standard deviation is given by

$$CI_2 = \left(\frac{USL - LSL}{6}\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2},\ \frac{USL - LSL}{6}\left[\frac{S^2\sqrt{2(n-1)}}{2t^*_{1-\alpha/2} + \sqrt{2(n-1)}}\right]^{-1/2}\right). \tag{8}$$

All confidence intervals were implemented using the open source statistical package R (Ihaka and Gentleman, 1996); the source code is available in the Appendix.

4 Simulation study

To assess the performance of the proposed confidence interval, we conducted a Monte Carlo simulation study to estimate the coverage probabilities and expected lengths of the proposed confidence interval under different situations and to compare them with those of the existing confidence interval. The estimated coverage probability and the expected length (based on $M$ replicates) are given by

$$\widehat{1-\alpha} = \frac{\#(L \le C_p \le U)}{M}$$

and

$$\text{Length} = \frac{\sum_{j=1}^{M}(U_j - L_j)}{M},$$

where $\#(L \le C_p \le U)$ denotes the number of simulation runs for which the true process capability index Cp lies within the confidence interval $(L, U)$. The data were generated from the distributions given in Table 1, each with population mean $\mu = 50$ and population standard deviation $\sigma = 1$.
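The two performance measures above can be estimated with a short Monte Carlo loop. The R sketch below does this for the chi-square interval only, using a shifted Gamma(4, 2) sample; the helper names `ci_chisq` and `simulate_coverage`, the reduced number of replicates and the chosen settings are assumptions for illustration rather than the configuration used in the study.

```r
# Sketch: Monte Carlo estimates of coverage probability and expected length.
# Helper names and settings are illustrative assumptions.

# Chi-square interval for Cp; returns c(lower, upper)
ci_chisq <- function(x, LSL, USL, alpha = 0.05) {
  n <- length(x)
  cp_hat <- (USL - LSL) / (6 * sd(x))
  cp_hat * sqrt(qchisq(c(alpha / 2, 1 - alpha / 2), df = n - 1) / (n - 1))
}

# rgen(n) draws one sample of size n; sigma is the true standard deviation
simulate_coverage <- function(M, n, rgen, LSL, USL, sigma = 1, alpha = 0.05) {
  cp_true <- (USL - LSL) / (6 * sigma)
  bounds <- t(replicate(M, ci_chisq(rgen(n), LSL, USL, alpha)))  # M x 2
  c(coverage = mean(bounds[, 1] <= cp_true & cp_true <= bounds[, 2]),
    length = mean(bounds[, 2] - bounds[, 1]))
}

set.seed(123)
rgamma_shifted <- function(n) rgamma(n, shape = 4, rate = 2) + 48  # mean 50, sd 1
out <- simulate_coverage(M = 2000, n = 25, rgen = rgamma_shifted,
                         LSL = 47, USL = 53)
print(out)
```

Passing the data generator as a function makes it easy to swap in any other distribution with mean 50 and standard deviation 1, and the same loop can wrap a bootstrap-t interval function in place of `ci_chisq`.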
Table 1: Probability distributions generated and the coefficient of skewness for Monte Carlo simulation.

Probability Distribution               Coefficient of Skewness
N(50, 1)                               0.000
Uniform(48.268, 51.732)                0.000
10 × Beta(4.4375, 13.3125) + 47.5      0.506
Gamma(9, 3) + 47                       0.667
Gamma(4, 2) + 48                       1.000
Gamma(2.25, 1.5) + 48.5                1.333
Gamma(1, 1) + 49                       2.000
Gamma(0.75, 0.867) + 49.1340           2.309
Gamma(0.5, 0.707) + 49.2929            2.828
Gamma(0.4, 0.6325) + 49.3675           3.163
Gamma(0.3, 0.5477) + 49.4523           3.651
Gamma(0.25, 0.5) + 49.5                4.000

The true values of the process capability index Cp and the corresponding LSL and USL are given in Table 2.

Table 2: True values of Cp, LSL and USL used for Monte Carlo simulation.

True Value of Cp     LSL       USL
1.00                 47.00     53.00
1.33                 46.01     53.99
1.50                 45.50     54.50
1.67                 44.99     55.01
2.00                 44.00     56.00

The sample sizes simulated were 10, 25, 50 and 100, and the number of simulation trials was set to 10,000. The number of bootstrap samples was 1,000. The nominal confidence level was fixed at 0.95. All simulations were performed using programs written in the open source statistical package R (Ihaka and Gentleman, 1996).

The simulation results are presented for four cases. As can be seen from Figures 1 and 2, the existing confidence interval (CI1) attains higher estimated coverage probabilities than the proposed confidence interval (CI2) when the data are generated from symmetrical and less skewed distributions (coefficient of skewness between 0 and 2), for all sample sizes. In particular, for the normal distribution CI1 provides estimated coverage probabilities close to the nominal level of 0.95, which are higher than those of CI2. In addition, the expected lengths of CI2 were shorter than those of CI1 for all sample sizes (see Figures 5 and 6).
On the other hand, for more skewed distributions (coefficient of skewness between 2.309 and 4), the estimated coverage probabilities of CI2 were greater than those of CI1 for almost all sample sizes, as shown in Figures 3 and 4. Figures 7 and 8 present the expected lengths of CI1 and CI2 for right-skewed distributions. We found that the expected lengths of CI1 were shorter than those of CI2 for all sample sizes.

Figure 1: The estimated coverage probabilities of CI1 and CI2 for Cp in case of N(50,1) (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

Figure 2: The estimated coverage probabilities of CI1 and CI2 for Cp in case of Gamma(4,2) + 48 (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

Figure 3: The estimated coverage probabilities of CI1 and CI2 for Cp in case of Gamma(0.75,0.867) + 49.1340 (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

Figure 4: The estimated coverage probabilities of CI1 and CI2 for Cp in case of Gamma(0.25,0.5) + 49.5 (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

Figure 5: The expected lengths of CI1 and CI2 for Cp in case of N(50,1) (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

Figure 6: The expected lengths of CI1 and CI2 for Cp in case of Gamma(4,2) + 48 (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

Figure 7: The expected lengths of CI1 and CI2 for Cp in case of Gamma(0.75,0.867) + 49.1340 (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

Figure 8: The expected lengths of CI1 and CI2 for Cp in case of Gamma(0.25,0.5) + 49.5 (panels: n = 10, 25, 50, 100; horizontal axis: Cp)

5 Conclusions

The existing confidence interval for the capability index Cp, based on the chi-square confidence interval for the standard deviation, assumes a normal distribution. However, the underlying distribution may be non-normal or skewed in some circumstances. A confidence interval for the capability index Cp based on the bootstrap-t confidence interval for the standard deviation was therefore developed. The proposed confidence interval was compared with the existing confidence interval through a Monte Carlo simulation study. The proposed confidence interval proved to be better than the existing confidence interval in terms of coverage probability when the data have a coefficient of skewness > 2. On the other hand, when the data are symmetrical or have a coefficient of skewness < 2, the estimated coverage probability of the existing confidence interval can be close to the nominal level.
Appendix: Source R code for all confidence intervals

    # Existing chi-square confidence interval for Cp, Equation (5)
    CI1 <- function(x, LSL, USL, alpha) {
      n <- length(x)
      S <- sd(x)
      chisq1 <- qchisq(alpha/2, df = n - 1)
      chisq2 <- qchisq(1 - alpha/2, df = n - 1)
      K <- (USL - LSL)/(6*S)
      ci.low <- K*sqrt(chisq1/(n - 1))
      ci.up <- K*sqrt(chisq2/(n - 1))
      out <- cbind(ci.low, ci.up)
      return(out)
    }

    # Proposed bootstrap-t confidence interval for Cp, Equation (8)
    CI2 <- function(x, LSL, USL, alpha) {
      n <- length(x)
      s2 <- var(x)
      percentile.T.S <- percentile.T.star(x, alpha)
      T1 <- percentile.T.S[1]   # lower percentile of T*
      T2 <- percentile.T.S[2]   # upper percentile of T*
      K1 <- (USL - LSL)/6
      K2 <- s2*sqrt(2*(n - 1))
      ci.low <- K1*(K2/(2*T1 + sqrt(2*(n - 1))))^(-1/2)
      ci.up <- K1*(K2/(2*T2 + sqrt(2*(n - 1))))^(-1/2)
      out <- cbind(ci.low, ci.up)
      return(out)
    }

    # Percentiles of the bootstrap statistic T* from B = 1000 resamples
    percentile.T.star <- function(x, alpha) {
      B <- 1000
      n <- length(x)
      S2 <- var(x)
      T.star <- numeric(B)
      for (i in 1:B) {
        xs <- sample(x, n, replace = TRUE)
        s2.star <- var(xs)
        # standardized variance ratio: sqrt((n-1)/2) * (S*^2/S^2 - 1)
        T.star[i] <- sqrt((n - 1)/2)*((s2.star/S2) - 1)
      }
      # (alpha/2)100th and (1 - alpha/2)100th percentiles, as in Equation (8)
      T1 <- quantile(T.star, probs = alpha/2)
      T2 <- quantile(T.star, probs = 1 - alpha/2)
      out <- cbind(T1, T2)
      return(out)
    }

Acknowledgements

The author would like to thank the anonymous referees for their helpful comments, which resulted in an improved paper. The author is also thankful for the support in the form of the research funds awarded by Thammasat University.

References

[1] Bittanti, S., Lovera, M. and Moiraghi, L. (1998): Application of non-normal process capability indices to semiconductor quality control. IEEE Transactions on Semiconductor Manufacturing, 11, 296-303.
[2] Casella, G. and Berger, R.L. (2001): Statistical Inference. Duxbury Press: Pacific Grove.
[3] Chen, K.S. and Pearn, W.L. (1997): An application of non-normal process capability indices. Quality and Reliability Engineering International, 13, 335-360.
[4] Cojbasic, V. and Tomovic, A. (2007): Nonparametric confidence intervals for population variances of one sample and the difference of variances of two samples. Computational Statistics & Data Analysis, 51, 5562-5578.
[5] Ding, J. (2004): A model of estimating process capability index from the first four moments of non-normal data. Quality and Reliability Engineering International, 20, 787-805.
[6] Efron, B. (1979): Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7, 1-26.
[7] Efron, B. and Tibshirani, R.J. (1993): An Introduction to the Bootstrap. Chapman & Hall: New York.
[8] Ihaka, R. and Gentleman, R. (1996): R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5, 299-314.
[9] Kane, V.E. (1986): Process Capability Indices. Journal of Quality Technology, 18, 41-52.
[10] Kotz, S. and Johnson, N.L. (1993): Process Capability Indices. Chapman & Hall: London.
[11] Kotz, S. and Lovelace, C.R. (1998): Process Capability Indices in Theory and Practice. Arnold: London.
[12] Pearn, W.L. and Kotz, S. (2006): Encyclopedia and Handbook of Process Capability Indices: A Comprehensive Exposition of Quality Control Measures. World Scientific: Singapore.
[13] Smithson, M. (2001): Correct confidence intervals for various regression effect sizes and parameters: the importance of noncentral distributions in computing intervals. Educational and Psychological Measurement, 61, 605-632.
[14] Steiger, J.H. (2004): Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological Methods, 9, 164-182.
[15] Thompson, B. (2002): What future quantitative social science research could look like: confidence intervals for effect sizes. Educational Researcher, 31, 25-32.
[16] Tosasukul, J., Budsaba, K. and Volodin, A. (2009): Dependent bootstrap confidence intervals for a population mean. Thailand Statistician, 7, 43-51.
[17] Wu, H.-H., Swain, J.J., Farrington, P.A. and Messimer, S.L. (1999): A weighted variance capability index for general non-normal processes. Quality and Reliability Engineering International, 15, 397-402.
[18] Zhang, J. (2010): Conditional confidence intervals of process capability indices following rejection of preliminary tests. Ph.D. Thesis, The University of Texas at Arlington, USA.

Investments in Renewable Energy Sources in OPEC Members: a Dynamic Panel Approach

Antonio A. Romano¹ and Giuseppe Scandurra²

Abstract

In this paper we analyze the key factors promoting investments in renewable energy sources in a panel dataset of members of the Organization of the Petroleum Exporting Countries (OPEC). To address these issues, we propose a dynamic panel analysis of renewable investments in a sample of OPEC members with distinct economic and social structures, for the years between 1980 and 2009. Results confirm that the key factors promoting investments in renewable energy sources are similar to those found in other studies that include more developed countries. However, the lack of grants and/or incentives to promote the installation of new renewable power plants is a limit on the future and sustainable development of these countries.

1 Introduction

Renewable Energy Sources (RES) are becoming increasingly important in the energy mix of countries because of their ability to limit the environmental impact of energy production and to counter the gradual rise in the price of the raw materials used in traditional generation based on gas and/or oil power plants. The central role of investments in renewable sources is confirmed by the attention the international scientific community has paid to them in recent years. Sadorsky (2009) studied the relationship between renewable energy sources (wind, solar and geothermal power, wood and wastes) and economic growth in a panel framework of 18 emerging economies for the period 1994-2003 and found that increases in real GDP had a positive and statistically significant effect on renewable energy consumption per capita.
Wolde-Rufael (2012) analyzes the causal nexus between nuclear consumption and GDP. Yuksel (2010) and Baris and Kucukali (2012) analyze RES deployment in Turkey and find that, thanks to its potential for renewable use, Turkey is working towards clean and sustainable energy development. Menz and Vachon (2006) and Carley (2009) study renewable investments in the USA, the former with a cross-sectional regression and the latter using a panel regression. Marques et al. (2010) analyze the drivers promoting renewable energy in European countries and find that lobbies of traditional energy sources and CO2 emissions restrain renewable deployment. Evidently, the need for economic growth suggests an investment that supports, but does not replace, the previously installed capacity. Romano and Scandurra (forthcoming-a) investigate the drivers of investments in renewable sources in a panel of OECD countries that also includes some developing countries, and the divergences between countries that do and do not produce electricity using nuclear power plants. In a second forthcoming paper (forthcoming-b), the same authors explore the drivers promoting investments in renewable energy sources and the divergences according to the development stage of the countries, employing a large sample of 60 countries split into 3 sub-samples following the classification proposed by the World Bank (low income and lower middle income; upper middle income; high income).

¹ Department of Management Studies and Quantitative Methods, University of Naples "Parthenope", via Generale Parisi, 13, 80132 Napoli, Italy; antonio.romano@uniparthenope.it.
² Department of Management Studies and Quantitative Methods, University of Naples "Parthenope", via Generale Parisi, 13, 80132 Napoli, Italy; giuseppe.scandurra@uniparthenope.it.
Gan and Smith (2011) identify key factors that may have driven the differences in the shares of renewable energy in total primary energy supply among OECD countries, for renewable energy in general and bioenergy in particular. Masini and Menichetti (2012) propose and test a conceptual model to analyze the factors affecting investor decisions and the relationship between investments in RES and portfolio performance.

The need to meet the demand for energy, together with environmental sensitivity, leads policy makers to plan further investments in generation plants based on renewable sources. However, despite the exponential growth in the production of energy from renewable sources in recent years, most of the energy demand is still met through the use of fossil fuels (IEA, 2012).

Currently there is great interest in the development of RES because of the prospect that the available reserves of fossil fuel will be depleted and because of the environmental pollution caused by burning fossil fuel. However, there are some disadvantages to using renewable energy. These are described below.

• The availability of fuel obtained from plants that can be used economically for energy is limited in practice, although a lot of research and development is going on around the world to develop plants that could provide suitable fuels economically and in sufficient quantities.
• The total potential of renewable energy sources such as wind power and tidal power is limited and/or intermittent.
• The current capital cost of equipment to convert renewable energy such as solar, wind and tidal energy is very high.
• Plants for generating power from wind and tides can be located only in places where suitable wind or tide conditions exist.
• Plants for generating energy from sunlight and wind have to be spread over large areas.
• Solar power depends on the availability of sunlight. Thus the availability of power fluctuates from zero to maximum every day.
• There have been some allegations that large-scale use of wind power can interfere with the pattern of wind flow and disturb the established weather pattern. Use of hydro power is already known to change the pattern of silting in rivers.

With this in mind, we analyze the drivers of investment in renewable energy sources in the Organization of the Petroleum Exporting Countries (OPEC). OPEC is a permanent, intergovernmental organization, created in 1960 by Iran, Iraq, Kuwait, Saudi Arabia and Venezuela. The organization now has 12 members, having since been joined by Algeria, Angola, Ecuador, Libya, Nigeria, Qatar and the United Arab Emirates. Its objective is to co-ordinate and unify petroleum policies among Member Countries in order to secure fair and stable prices for petroleum producers; an efficient, economic and regular supply of petroleum to consuming nations; and a fair return on capital to those investing in the industry.

In this paper we analyze the determinants of investments in renewable sources (hydroelectric and other renewable sources) and the divergences in the composition of the energy mix of countries. In practice, we test the impact of key factors on renewables, highlighting the progressive adaptation to changing energy needs. This paper addresses these issues by means of a dynamic panel analysis of renewable investment in a sample of OPEC countries with distinct economic and social structures as well as different levels of economic development. The data are annual time series from 1980 to 2009. In the proposed model we include the main policy, environmental, socio-economic and generation factors. We use a dynamic specification of the equation that takes into account past investments in renewable energy sources. A widely used methodology for dynamic panel modeling applies the Generalized Method of Moments (GMM) estimators proposed by Arellano and Bond (1991).
In particular, we try to understand whether RES contribute significantly to climate change and whether OPEC members, characterized by a large availability of fossil fuel, invest in RES.

The paper is organized as follows. Section 2 describes the data. Section 3 briefly explains the proposed method. Section 4 reports the model and the empirical results and discusses the policy implications. Section 5 concludes.

2 Data

The data used in this paper are from the U.S. Energy Information Administration (EIA) and International Energy Agency (IEA) databases. Following the literature (e.g. Carley, 2009; Marques and Fuinhas, 2011), the explanatory variables try to capture the main socioeconomic, political and environmental factors from which investment decisions originate.

For the environmental factors we consider the per capita Carbon Dioxide Emissions (CO2) from the Consumption of Energy. CO2 emission is one of the main contributors to the greenhouse gas (GHG) effect and can be considered a proxy of environmental degradation, although it is not the only factor responsible. The expected result is an estimate with a significant positive effect. A negative effect would emphasize the persistence of an economy tied to fossil fuels and still unable to replace the traditional energy sources.

The last class of factors (socioeconomic) includes per capita GDP, per capita Consumption of Energy and a proxy for the energy security of supply. GDP is directly related to energy consumption (Sadorsky, 2009). The per capita Consumption of Electricity is considered a proxy for the economic development of the country (e.g. Toklu, 2011), but it also represents the evolution of energy demand. The need to meet energy demand can lead to the creation of new power plants based on RES, increasing investment. However, if the increasing demand is met through traditional power plants based on fossil fuel, then the effect on investment will be negative.
A similar argument applies to energy security, approximated by the degree of dependence on foreign supplies of electricity. The need to increase the domestic share of production (reducing the energy bill) and to reduce dependence could increase investment in RES. Considering the main production of the countries, we also include annual oil extraction. The expected result is an estimate with a significant positive effect: an increase in oil extraction can encourage countries to increase investment in RES.

Various forms of incentives are currently adopted, many of them directly affected by the wealth of the countries for which we have detailed information³. However, there is a lack of information about the availability of grants to promote renewables in the OPEC countries. In particular, to the best of our knowledge, these countries do not seem to provide any incentives for renewable investments. For this reason we do not include a policy variable.

In order to reduce variability, GDP, EI, electricity consumption, oil supply and CO2 are expressed in natural logarithms.

The analysis of the data on generation sources (see Table 1) in the dataset considered (OPEC) highlights different patterns across the countries:

• Some countries do not have generation based on RES (Kuwait, Libya, Qatar, Saudi Arabia).
• Angola, Ecuador and Venezuela generate most of their electricity from RES.
• Iran and Nigeria generate an appreciable share of their electricity from RES.
• The United Arab Emirates have had a small share of generation from RES since 2009, when the first solar power plants were put into operation.

In the entire sample we observe, however, that the generation from RES is obtained almost entirely from hydroelectric plants.

³ For example, the European Commission, with Directive 2001/77/EC, aims to promote electricity produced from renewable energy sources.
Given the great availability of fossil fuels for the production of electrical energy, these countries have little considered the possibility of generation sources based on renewable. Considering the generation share from RES in the countries included in our dataset, we reduce its sectional dimension, analyzing only countries that generate electricity from RES. In addition, Iraq has not been included due to missing data in the GDP series. The countries we have included in the final sample are: Algeria, Angola, Ecuador, Iran, Nigeria and Venezuela. Table 1: Mean Electricity generation by sources and countries (1980 - 2009). Countries Share o f total renewable power generation (%) Share of renewable - not based on hydroelectric power plants (%) Share of thermal power generation (%) Algeria 1.88 0 98.22 Angola 65.60 0 34.40 Ecuador 64.70 0.54 35.30 Iran 11.86 0.01 88.14 Iraq 5.00 0 95.00 Kuwait 0 0 1 Libya 0 0 1 Nigeria 34.56 0 65.44 Qatar 0 0 1 Saudi Arabia 0 0 1 United Arab Emirates 0.99 0.01 99.00 Venezuela 64.39 0 35.61 Different ways to evaluate the development of RES are proposed in literature. Bird et al. (2005) measure the total amount of renewable energy produced while Marques et al. (2010) use the contribution of renewable to energy supply. Following Romano and Scandurra (forthcoming-a) we explain the investment in RES (ShRen) as the ratio between Renewable Generation and Total Net Electricity Generation. The share 98 Antonio A. Romano and Giuseppe Scandurra of Renewable Electricity Net Generation can be considered a proxy of investments in RES. 3 Method Dynamic panel data (DPD) models contain one or more lagged dependent variables, allowing for the modeling of a partial adjustment mechanism, i.e.: Yi,t = 5yi,t-i + *itP + ui,t (3.1) where for country i (i=l,... ,N) at time t (t=l,... 
, T), $\delta$ is a scalar, $y_{i,t}$ is the outcome variable, $y_{i,t-1}$ is the lagged dependent variable and $x_{i,t}$ is the vector of independent variables, while the error term

\[ u_{i,t} = \alpha_i + \tau_{i,t} \tag{3.2} \]

follows a one-way error component model, where $\alpha_i$ denotes a country-specific effect, $\tau_{i,t}$ denotes an observation-specific effect, $\alpha_i \sim IID(0, \sigma^2_\alpha)$ and $\tau_{i,t} \sim IID(0, \sigma^2_\tau)$.

The dynamic panel data regression described in (3.1) and (3.2) is characterized by two sources of persistence over time: autocorrelation due to the presence of a lagged dependent variable among the regressors, and individual effects characterizing the heterogeneity among the individuals. Several econometric problems may arise from estimating the parameters in eq. (3.1) (cf. Hsiao, 2003): i) the variables in $x_{i,t}$ are assumed to be endogenous; ii) time-invariant country characteristics (fixed effects) may be correlated with the explanatory variables; iii) the presence of the lagged dependent variable $y_{i,t-1}$ gives rise to autocorrelation. Under these assumptions, estimation with fixed effects (OLS) or random effects (GLS) would not be appropriate, since the resulting estimates would be biased. Since $y_{i,t}$ is a function of $\alpha_i$, it immediately follows that $y_{i,t-1}$ is also a function of $\alpha_i$. Therefore $y_{i,t-1}$, a right-hand regressor in (3.1), is correlated with the error term. This renders the OLS estimator biased and inconsistent even if the $\tau_{i,t}$ are not serially correlated.

One way to solve this problem is to estimate a dynamic panel data model based on the Generalized Method of Moments (GMM) estimator proposed by Arellano and Bond (1991). The GMM procedure is more efficient than the Anderson and Hsiao (1982) estimator, while Ahn and Schmidt (1995) derived additional nonlinear moment restrictions not exploited by the Arellano and Bond (1991) GMM estimator. Arellano and Bond argue that the Anderson-Hsiao estimator, while consistent, fails to take all of the potential orthogonality conditions into account.
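The bias described above can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a model of the form (3.1) without regressors, standard normal $\alpha_i$ and $\tau_{i,t}$, and hypothetical function names (`simulate_panel`, `pooled_ols`, `within_estimator`). It compares pooled OLS and the within (fixed effects) estimator of $\delta$ on simulated data.

```python
import numpy as np


def simulate_panel(n_units, n_periods, delta, rng, burn_in=50):
    """Simulate y[i, t] = delta * y[i, t-1] + alpha[i] + tau[i, t]."""
    alpha = rng.normal(size=n_units)                       # unit-specific effects
    y = alpha / (1.0 - delta) + rng.normal(size=n_units)   # rough starting values
    columns = []
    for t in range(burn_in + n_periods):
        y = delta * y + alpha + rng.normal(size=n_units)
        if t >= burn_in:                                   # keep post burn-in periods
            columns.append(y)
    return np.column_stack(columns)                        # shape (n_units, n_periods)


def pooled_ols(y):
    """OLS of y[i, t] on a constant and y[i, t-1], ignoring the unit effects."""
    y_lag = y[:, :-1].ravel()
    y_cur = y[:, 1:].ravel()
    x = np.column_stack([np.ones_like(y_lag), y_lag])
    coef, *_ = np.linalg.lstsq(x, y_cur, rcond=None)
    return float(coef[1])


def within_estimator(y):
    """Fixed effects estimator: OLS after removing unit means."""
    y_lag = y[:, :-1]
    y_cur = y[:, 1:]
    y_lag_d = y_lag - y_lag.mean(axis=1, keepdims=True)
    y_cur_d = y_cur - y_cur.mean(axis=1, keepdims=True)
    return float((y_lag_d * y_cur_d).sum() / (y_lag_d ** 2).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = simulate_panel(n_units=500, n_periods=6, delta=0.5, rng=rng)
    print("true delta      :", 0.5)
    print("pooled OLS      :", round(pooled_ols(y), 3))
    print("within (FE) est.:", round(within_estimator(y), 3))
```

Under these assumptions, pooled OLS tends to overstate $\delta$ because $y_{i,t-1}$ is positively correlated with $\alpha_i$, while the within estimator tends to understate it when T is small, because demeaning correlates the transformed lag with the transformed error.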
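A key aspect of the method proposed by Arellano and Bond is the assumption that the necessary instruments are 'internal', that is, based on lagged values of the instrumented variable(s) (Baltagi, 2005). The estimators allow the inclusion of external instruments as well. For instance, let us consider a simple autoregressive model with no regressors:

\[ y_{i,t} = \delta y_{i,t-1} + u_{i,t} \tag{3.3} \]

where $u_{i,t} = \alpha_i + \tau_{i,t}$ with $\alpha_i \sim IID(0, \sigma^2_\alpha)$ and $\tau_{i,t} \sim IID(0, \sigma^2_\tau)$, independent of each other and among themselves. In order to get a consistent estimate of $\delta$ as $N \to \infty$ with T fixed, we first difference (3.3) to eliminate the individual effects:

\[ \Delta y_{i,t} = y_{i,t} - y_{i,t-1} = \delta (y_{i,t-1} - y_{i,t-2}) + (\tau_{i,t} - \tau_{i,t-1}) = \delta \Delta y_{i,t-1} + \Delta\tau_{i,t}, \qquad t = 3, \dots, T \tag{3.4} \]

and note that $(\tau_{i,t} - \tau_{i,t-1})$ is MA(1) with unit root. Equation (3.4) is equivalent to a system of (T-2) simultaneous equations with N observations each, or:

\[
\begin{aligned}
\Delta y_{i,3} &= \delta \Delta y_{i,2} + \Delta\tau_{i,3}, && \text{instruments: } y_{i1} \\
\Delta y_{i,4} &= \delta \Delta y_{i,3} + \Delta\tau_{i,4}, && \text{instruments: } y_{i1}, y_{i2} \\
&\;\;\vdots \\
\Delta y_{i,T} &= \delta \Delta y_{i,T-1} + \Delta\tau_{i,T}, && \text{instruments: } y_{i1}, y_{i2}, \dots, y_{i,T-2}
\end{aligned}
\]

where the instruments are uncorrelated with the error terms. The variance-covariance matrix of the error term can be expressed as:

\[
V = E(\Delta\tau_i \Delta\tau_i') = \sigma^2_\tau
\begin{pmatrix}
2 & -1 & 0 & \cdots & 0 & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -1 & 2 & -1 \\
0 & 0 & 0 & \cdots & 0 & -1 & 2
\end{pmatrix}
\]

which is (T-2) x (T-2), since $(\tau_{i,t} - \tau_{i,t-1})$ is MA(1) with unit root. Define the (T-2) x C matrix

\[
Z_i =
\begin{pmatrix}
y_{i1} & 0 & 0 & 0 & 0 & 0 & \cdots & 0 & \cdots & 0 \\
0 & y_{i1} & y_{i2} & 0 & 0 & 0 & \cdots & 0 & \cdots & 0 \\
0 & 0 & 0 & y_{i1} & y_{i2} & y_{i3} & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & & \vdots \\
0 & 0 & 0 & 0 & 0 & 0 & \cdots & y_{i1} & \cdots & y_{i,T-2}
\end{pmatrix}
\]

where $C = \sum_{j=1}^{T-2} j$ and the rows contain the instruments. Then, the N(T-2) x C matrix of instruments is $Z = [Z_1', \dots, Z_N']'$ and the moment equations described above are given by $E(Z_i' \Delta\tau_i) = 0$.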
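The construction of the instrument matrix, and the one-step estimator it leads to (equation (3.6)), can be sketched in a few lines. This is an illustrative sketch on simulated data, not the authors' implementation: the function names (`instrument_block`, `ma1_matrix`, `one_step_gmm`, `simulate_ar1_panel`) are hypothetical, the model is the no-regressor case (3.3), and the scalar $\sigma^2_\tau$ in V is dropped because it cancels in the estimator.

```python
import numpy as np


def instrument_block(y_i):
    """Z_i for one unit: the row for period t (t = 3..T) holds y_i1..y_i,t-2."""
    n_periods = len(y_i)
    n_cols = (n_periods - 2) * (n_periods - 1) // 2   # C = 1 + 2 + ... + (T-2)
    z = np.zeros((n_periods - 2, n_cols))
    col = 0
    for row in range(n_periods - 2):                  # row 0 corresponds to t = 3
        width = row + 1
        z[row, col:col + width] = y_i[:width]
        col += width
    return z


def ma1_matrix(n_periods):
    """(T-2) x (T-2) matrix with 2 on the diagonal and -1 beside it."""
    m = n_periods - 2
    return 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)


def one_step_gmm(y):
    """One-step estimate of delta for a panel y of shape (N, T)."""
    n_units, n_periods = y.shape
    dy = np.diff(y, axis=1)                           # dy[:, j] = y[:, j+1] - y[:, j]
    g = ma1_matrix(n_periods)
    n_cols = (n_periods - 2) * (n_periods - 1) // 2
    zgz = np.zeros((n_cols, n_cols))                  # Z'(I_N kron V)Z up to a scalar
    z_dy_lag = np.zeros(n_cols)                       # Z' (Delta y_{-1})
    z_dy = np.zeros(n_cols)                           # Z' (Delta y)
    for i in range(n_units):
        z_i = instrument_block(y[i])
        zgz += z_i.T @ g @ z_i
        z_dy_lag += z_i.T @ dy[i, :-1]                # Delta y_{i,t-1}, t = 3..T
        z_dy += z_i.T @ dy[i, 1:]                     # Delta y_{i,t},   t = 3..T
    w = np.linalg.pinv(zgz)
    return float((z_dy_lag @ w @ z_dy) / (z_dy_lag @ w @ z_dy_lag))


def simulate_ar1_panel(n_units, n_periods, delta, rng, burn_in=50):
    """Simulated data (an assumption): y_it = delta * y_i,t-1 + alpha_i + tau_it."""
    alpha = rng.normal(size=n_units)
    y = alpha / (1.0 - delta) + rng.normal(size=n_units)
    columns = []
    for t in range(burn_in + n_periods):
        y = delta * y + alpha + rng.normal(size=n_units)
        if t >= burn_in:
            columns.append(y)
    return np.column_stack(columns)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    print(instrument_block(np.arange(1.0, 6.0)))      # T = 5 gives a 3 x 6 block
    y = simulate_ar1_panel(2000, 7, 0.5, rng)
    print("one-step estimate of delta:", round(one_step_gmm(y), 3))
```

Summing $Z_i' V Z_i$ over units avoids forming the full N(T-2) x N(T-2) Kronecker product, which is equivalent but much larger in memory.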
Premultiplying the differenced equation (3.4) in vector form by $Z'$, one gets

\[ Z'\Delta y = Z'(\Delta y_{-1})\delta + Z'\Delta\tau \tag{3.5} \]

Performing GLS on (3.5), one gets the Arellano and Bond (1991) preliminary one-step consistent estimator:

\[ \hat{\delta}_1 = \left[ (\Delta y_{-1})' Z \left( Z'(I_N \otimes V) Z \right)^{-1} Z'(\Delta y_{-1}) \right]^{-1} \left[ (\Delta y_{-1})' Z \left( Z'(I_N \otimes V) Z \right)^{-1} Z'(\Delta y) \right] \tag{3.6} \]

One can get the two-step Arellano and Bond (1991) GMM estimator by replacing the matrix of the second population moments with that of the corresponding second sample moments. For a more detailed discussion see e.g. Baltagi (2005).

4 Model and discussion

In this paper we employ a panel dataset including 6 OPEC countries from 1980 to 2009⁴. There are three main issues that can be solved using a panel dataset. In fact, a panel dataset allows us to have more degrees of freedom than with time-series or cross-sectional data, to control for omitted variable bias, and to reduce the problem of multicollinearity, hence improving the accuracy of parameter estimates (Hsiao, 2003) by providing more informative data. Furthermore, annual data avoid seasonality problems.

⁴ Arellano and Bond's (1991) GMM estimator is consistent for large N (number of countries) with T fixed. In our empirical research, the sample was initially broader and included all of the OPEC members. Considering that some of them do not have sources of generation based on renewable energy, i.e. ShRen = 0 in the analysed years, we employ a subset of countries. The sectional component of the error remains in the variables and must thus refer to the whole sample. Furthermore, we tried using only the most recent instruments (and also simple OLS estimation), but without appreciable changes in significance.

Since static regression models can suffer from a number of problems, including structural instability and spurious regression, we employ a dynamic analysis that allows for slow adjustment.
The dynamic model captures the "persistence effect" on investment in RES5. The assumed model is as follows: K ShReni t = c + (1 + y)ShReni t-1 + ^ p1kAlnGDPi t-k k=0 K K + ^