Control Charts for Skewed Distributions: Weibull, Gamma, and Lognormal

Karagoz Derya and Hamurkaroglu Canan¹

Abstract

In this paper the control limits of $\bar{X}$ and R control charts for skewed distributions are obtained by the classic, the weighted variance (WV), the weighted standard deviations (WSD) and the skewness correction (SC) methods. These methods are compared by Monte Carlo simulation. The Type I risk probabilities of the control charts are compared across different subgroup sizes for three skewed distributions: Weibull, gamma and lognormal. The simulation results show that the Type I risk of the SC method is smaller than that of the other methods. When the distribution is approximately symmetric, the Type I risks of the Shewhart, WV, WSD and SC $\bar{X}$ charts are comparable, while the SC R chart has a noticeably smaller Type I risk.

1 Introduction

Control charts are among the most commonly used and powerful tools in statistical process control. They serve (1) to learn about a process, (2) to monitor a process for control and (3) to improve it sequentially, and they are now widely accepted and applied in industry. The conventional Shewhart $\bar{X}$ and R control charts are based on the assumption that the distribution of the quality characteristic (also called the process distribution) is normal or approximately normal. In many situations, however, the normality assumption is not valid. One such case is a skewed distribution (e.g., Bai and Choi (1995), Choobineh and Branting (1986), and Nelson (1979)). For instance, the distributions of measurements in chemical processes, semiconductor processes and cutting tool wear processes, and of lifetimes in accelerated life test samples, are often skewed. When the quality variable has a skewed distribution, monitoring the process with the Shewhart $\bar{X}$ and R control charts may be misleading.
Using Shewhart control charts for skewed distributions increases the Type I risk as the skewness increases, because of the variability in the population. For this reason, three methods that use asymmetric control limits have been proposed as alternatives to the classical method. The first is the weighted variance (WV) method proposed by Choobineh and Ballard (1987), which is based on the semivariance approximation of Choobineh and Branting (1986). They obtained asymmetric control limits for $\bar{X}$ and R charts for skewed distributions based on the standard deviation of sample means and ranges. Bai and Choi (1995) also proposed a simple heuristic method of constructing $\bar{X}$ and R charts using the WV method. The second is the weighted standard deviations (WSD) method proposed by Chang and Bai (2001). It is used to construct $\bar{X}$, cumulative sum and exponentially weighted moving average control charts for skewed distributions, and it obtains the control limits by decomposing the standard deviation into two parts. The last is the skewness correction (SC) method proposed by Chan and Cui (2003) for constructing $\bar{X}$ and R charts, which takes into account the degree of skewness of the process distribution and makes no assumptions about the distribution.

¹ Department of Statistics, Hacettepe University, Beytepe, Turkey; deryacal@hacettepe.edu.tr, caca@hacettepe.edu.tr

The Type I risk is the probability that a subgroup $\bar{X}$ or R falls outside the ±3 sigma control limits when the process is in control; for a normally distributed statistic it equals 0.27%. In other words, if the process is in control and the process statistic is normal, 99.73% of all points fall between the control limits, and about 0.0027 of all points are false alarms with no assignable cause of variation.
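The 0.27% figure can be checked numerically. The following is a minimal sketch, assuming only that the charted statistic is exactly normal; the variable names are illustrative and not taken from the paper.

```python
import math

# Tail probability of a standard normal beyond +3 sigma (one direction).
# erfc(z / sqrt(2)) equals the two-sided tail probability P(|Z| > z).
two_sided = math.erfc(3 / math.sqrt(2))
one_sided = two_sided / 2

print(f"one-sided tail beyond 3 sigma: {one_sided:.5f}")
print(f"two-sided Type I risk:         {two_sided:.5f}")
```

Any departure from normality, such as skewness, changes these tail areas, which is the motivation for the asymmetric limits discussed in the paper.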
Let X denote the value of a process characteristic. If the system of chance causes generates variation in X that follows the normal distribution, the 0.001 probability limits are very close to the 3σ limits: from normal tables, the probability beyond 3σ in one direction is 0.00135, or 0.0027 in both directions.

In this paper, the Type I risks of $\bar{X}$ and R control charts based on the classic Shewhart, WV, WSD and SC methods are compared by Monte Carlo simulation. The Weibull, gamma and lognormal distributions are chosen since they can represent a wide variety of shapes, from nearly symmetric to highly skewed. The simulation results show that, as the skewness increases, the Type I risk of the SC method is smaller than that of the other methods. When the distribution is approximately symmetric, the Type I risks of the SC, WSD, WV and Shewhart $\bar{X}$ charts are comparable, while the SC R chart has a noticeably smaller Type I risk.

The remainder of the paper is organized as follows. Sections 2.1, 2.2 and 2.3 give the control limits of $\bar{X}$ and R control charts for skewed populations under the WV, WSD and SC methods, respectively. Section 3 presents the Monte Carlo simulation study that compares the Type I risk probabilities of these control charts across different subgroup sizes for the Weibull, gamma and lognormal distributions. Section 4 concludes and formulates some ideas for further research.

2 Methods

This section gives the control limits of $\bar{X}$ control charts for skewed populations under the classic, WV, WSD and SC methods, and the control limits of R control charts under the classic, WV and SC methods.

2.1 WV Method

The WV method makes no assumptions about the population and adjusts the control limits according to the skewness of the underlying population. The probability that the quality variable X is less than or equal to its mean $\mu_X$ is $P_X = P(X \le \mu_X)$.
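To make $P_X$ concrete, the sketch below evaluates it in closed form for two of the distributions used later, together with the skewness. It assumes the unit-scale Weibull and zero-location lognormal parameterizations stated in the simulation section; the function names are hypothetical, and the formulas are the standard moment expressions rather than anything specific to this paper.

```python
import math

def weibull_skew_and_px(beta):
    """Skewness and P_X = P(X <= mean) for a Weibull with shape beta, scale 1."""
    g1, g2, g3 = (math.gamma(1 + k / beta) for k in (1, 2, 3))
    var = g2 - g1 ** 2
    skew = (g3 - 3 * g1 * g2 + 2 * g1 ** 3) / var ** 1.5
    p_x = 1 - math.exp(-(g1 ** beta))  # Weibull CDF evaluated at the mean g1
    return skew, p_x

def lognormal_skew_and_px(sigma):
    """Skewness and P_X for a lognormal with location 0 and scale sigma."""
    w = math.exp(sigma ** 2)
    skew = (w + 2) * math.sqrt(w - 1)
    # ln(mean) = sigma^2 / 2, so P_X = Phi(sigma / 2).
    p_x = 0.5 * (1 + math.erf(sigma / (2 * math.sqrt(2))))
    return skew, p_x

for beta in (2.15, 1.00):
    print("Weibull  ", beta, weibull_skew_and_px(beta))
for sigma in (0.16, 0.54):
    print("lognormal", sigma, lognormal_skew_and_px(sigma))
```

For a symmetric distribution $P_X = 0.5$; for positively skewed distributions $P_X$ exceeds 0.5, and the printed values can be compared with the entries of Table 1.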
If the parameters of the process are known, the control limits of the $\bar{X}$ control chart are given by

$$UCL_{\bar{X}} = \mu_X + 3\frac{\sigma_X}{\sqrt{n}}\sqrt{2P_X}, \qquad LCL_{\bar{X}} = \mu_X - 3\frac{\sigma_X}{\sqrt{n}}\sqrt{2(1-P_X)} \qquad (2.1)$$

where $\sigma_X$ is the standard deviation of X (Bai and Choi 1995). The control limits of the R control chart are given by

$$UCL_R = \mu_R + 3\sigma_R\sqrt{2P_X}, \qquad LCL_R = \left[\mu_R - 3\sigma_R\sqrt{2(1-P_X)}\right]^+ \qquad (2.2)$$

where $\mu_R$ and $\sigma_R$ are the mean and standard deviation of the range for samples of size n, and $[a]^+$ denotes $\max(0, a)$ (Bai and Choi 1995).

In practice, $P_X$ and the process parameters are generally not known and must be estimated. The probability $P_X$ can be estimated from the number of observations less than or equal to the grand mean $\bar{\bar{X}}$:

$$\hat{P}_X = \frac{\sum_{i=1}^{k}\sum_{j=1}^{n}\delta(\bar{\bar{X}} - X_{ij})}{nk}$$

where k and n are the number of samples and the number of observations in a subgroup, and $\delta(x) = 1$ for $x \ge 0$ and 0 otherwise. Usually, $\mu_X$ is estimated by the grand mean of the subgroup means, $\bar{\bar{X}}$, and $\mu_R$ is estimated by the mean of the subgroup ranges, $\bar{R}$.

If the parameters of the process are unknown, the control limits of the $\bar{X}$ control chart are given by

$$UCL_{\bar{X}} = \bar{\bar{X}} + 3\frac{\bar{R}}{d_2'\sqrt{n}}\sqrt{2\hat{P}_X} = \bar{\bar{X}} + W_U\bar{R}, \qquad LCL_{\bar{X}} = \bar{\bar{X}} - 3\frac{\bar{R}}{d_2'\sqrt{n}}\sqrt{2(1-\hat{P}_X)} = \bar{\bar{X}} - W_L\bar{R} \qquad (2.3)$$

(Bai and Choi 1995). The control limits of the R control chart are given by

$$UCL_R = \bar{R}\left[1 + 3\frac{d_3'}{d_2'}\sqrt{2\hat{P}_X}\right] = V_U\bar{R}, \qquad LCL_R = \bar{R}\left[1 - 3\frac{d_3'}{d_2'}\sqrt{2(1-\hat{P}_X)}\right]^+ = V_L\bar{R} \qquad (2.4)$$

where $d_2'$ and $d_3'$ are the control chart constants for the $\bar{X}$ and R charts based on the WV method. These constants, defined as the mean and standard deviation of the relative range, are obtained without the normality assumption. They can be computed by numerical integration once the distribution is specified (Bai and Choi 1995).

2.2 WSD Method

In the WSD method, as in the WV method, a skewed distribution is decomposed into two parts at its mean, and each part is used to create a new symmetric distribution adjusted in accordance with the degree of skewness. If the parameters of the process are known, the control limits of the $\bar{X}$ chart are given by

$$UCL_{\bar{X}} = \mu_X + 3\frac{\sigma_X}{\sqrt{n}}\,2P_X, \qquad LCL_{\bar{X}} = \mu_X - 3\frac{\sigma_X}{\sqrt{n}}\,2(1-P_X) \qquad (2.5)$$

where $\sigma_X$ is the standard deviation of the skewed distribution (Chang and Bai 2001).
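The known-parameter WV and WSD limits for the $\bar{X}$ chart differ only in how $P_X$ rescales each side of the 3-sigma band. The sketch below computes both, along with the empirical estimate of $P_X$. It assumes the limits take the forms of Bai and Choi (1995) and Chang and Bai (2001) as read here, that is, $\sqrt{2P_X}$ weights for WV and $2P_X$ weights for WSD; the function names are illustrative.

```python
import math

def wv_xbar_limits(mu, sigma, n, p_x):
    """WV X-bar limits with known parameters: weights sqrt(2P) and sqrt(2(1-P))."""
    se = sigma / math.sqrt(n)
    return (mu - 3 * se * math.sqrt(2 * (1 - p_x)),
            mu + 3 * se * math.sqrt(2 * p_x))

def wsd_xbar_limits(mu, sigma, n, p_x):
    """WSD X-bar limits with known parameters: weights 2P and 2(1-P)."""
    se = sigma / math.sqrt(n)
    return (mu - 3 * se * 2 * (1 - p_x),
            mu + 3 * se * 2 * p_x)

def estimate_p_x(subgroups):
    """Fraction of all observations at or below the grand mean."""
    values = [x for group in subgroups for x in group]
    grand_mean = sum(values) / len(values)
    return sum(1 for x in values if x <= grand_mean) / len(values)

# Example: exponential process (mean 1, sd 1), subgroup size 5.
p_exp = 1 - math.exp(-1)
print("WV :", wv_xbar_limits(1.0, 1.0, 5, p_exp))
print("WSD:", wsd_xbar_limits(1.0, 1.0, 5, p_exp))
```

With $P_X = 0.5$ both functions return the ordinary Shewhart limits $\mu_X \pm 3\sigma_X/\sqrt{n}$; for $P_X > 0.5$ the upper limit moves outward and the lower limit moves inward.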
If the parameters of the process are unknown, the control limits of the $\bar{X}$ chart are given by

$$UCL_{\bar{X}} = \bar{\bar{X}} + 3\frac{\bar{R}}{d_2^*\sqrt{n}}\,2\hat{P}_X = \bar{\bar{X}} + WS_U\bar{R}, \qquad LCL_{\bar{X}} = \bar{\bar{X}} - 3\frac{\bar{R}}{d_2^*\sqrt{n}}\,2(1-\hat{P}_X) = \bar{\bar{X}} - WS_L\bar{R} \qquad (2.6)$$

where $d_2^*$, $WS_U$ and $WS_L$ are control chart constants for the WSD method. The constant $d_2^*$ is obtained as

$$d_2^* = P_X\,d_2\big(2n(1-P_X)\big) + (1-P_X)\,d_2\big(2nP_X\big) \qquad (2.7)$$

where $d_2(n)$ is $d_2$ for sample size n. When the underlying distribution is symmetric, $d_2^*$ is equal to $d_2$ (Chang and Bai 2001).

2.3 SC Method

The SC method is used for constructing $\bar{X}$ and R control charts for skewed distributions. Its asymmetric control limits are obtained by taking into account the degree of skewness estimated from the subgroups, with no assumptions about the distribution. If the parameters of the process are known, the control limits of the $\bar{X}$ control chart are given by

$$UCL_{\bar{X}} = \mu_X + (3 + c_4^*)\frac{\sigma_X}{\sqrt{n}}, \qquad LCL_{\bar{X}} = \mu_X + (-3 + c_4^*)\frac{\sigma_X}{\sqrt{n}} \qquad (2.8)$$

(Chan and Cui 2003). If the parameters of the process are known, the control limits of the R control chart are given by

$$UCL_R = \mu_R + (3 + d_4^*)\sigma_R, \qquad LCL_R = \mu_R + (-3 + d_4^*)\sigma_R \qquad (2.9)$$

(Chan and Cui 2003). In Equations (2.8) and (2.9), $c_4^*$ and $d_4^*$ are the control chart constants for the SC method. $LCL_R$ is set to zero if it is negative. If the underlying distribution is symmetric, $c_4^* = 0$ and the $\bar{X}$ chart reduces to the Shewhart chart. The constants $c_4^*$ and $d_4^*$ are obtained as follows:

$$c_4^* = \frac{\frac{4}{3}k_3(\bar{X})}{1 + 0.2\,k_3^2(\bar{X})}, \qquad d_4^* = \frac{\frac{4}{3}k_3(R)}{1 + 0.2\,k_3^2(R)} \qquad (2.10)$$

where $k_3(\bar{X})$ is the skewness of the subgroup mean $\bar{X}$ and $k_3(R)$ is the skewness of the subgroup range R (Chan and Cui 2003).

If the parameters of the process are unknown, the control limits of the $\bar{X}$ control chart are given by

$$UCL_{\bar{X}} = \bar{\bar{X}} + (3 + c_4^*)\frac{\bar{R}}{d_2^*\sqrt{n}} = \bar{\bar{X}} + A_U^*\bar{R}, \qquad LCL_{\bar{X}} = \bar{\bar{X}} + (-3 + c_4^*)\frac{\bar{R}}{d_2^*\sqrt{n}} = \bar{\bar{X}} - A_L^*\bar{R} \qquad (2.11)$$

(Chan and Cui 2003). If the parameters of the process are unknown, the control limits of the R control chart are given by

$$UCL_R = \left[1 + (3 + d_4^*)\frac{d_3^*}{d_2^*}\right]\bar{R} = D_4^*\bar{R}, \qquad LCL_R = \left[1 + (-3 + d_4^*)\frac{d_3^*}{d_2^*}\right]^+\bar{R} = D_3^*\bar{R} \qquad (2.12)$$

(Chan and Cui 2003).
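The skewness correction in Equation (2.8) can be illustrated with a short sketch. It assumes the form $c_4^* = \tfrac{4}{3}k_3(\bar{X}) / (1 + 0.2\,k_3^2(\bar{X}))$ attributed to Chan and Cui (2003), and it uses the standard fact that the skewness of the mean of n independent observations is $k_3/\sqrt{n}$. The function names are hypothetical.

```python
import math

def sc_constant(k3_stat):
    """Skewness-correction constant for a statistic with skewness k3_stat."""
    return (4.0 / 3.0) * k3_stat / (1.0 + 0.2 * k3_stat ** 2)

def sc_xbar_limits(mu, sigma, n, k3):
    """SC X-bar limits with known mean, sd, and population skewness k3."""
    c4 = sc_constant(k3 / math.sqrt(n))  # skewness of the subgroup mean
    se = sigma / math.sqrt(n)
    return mu + (-3 + c4) * se, mu + (3 + c4) * se

# Example: exponential process (mean 1, sd 1, skewness 2), subgroup size 5.
lcl, ucl = sc_xbar_limits(1.0, 1.0, 5, 2.0)
print(f"SC limits: LCL = {lcl:.3f}, UCL = {ucl:.3f}")
print("Shewhart :", 1 - 3 / math.sqrt(5), 1 + 3 / math.sqrt(5))
```

Both limits shift in the direction of the skew by the same amount $c_4^*\sigma_X/\sqrt{n}$, so the band keeps its 6-sigma width, unlike the WV and WSD methods, which rescale each side separately.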
3 Simulation study

The Type I risks of $\bar{X}$ and R control charts based on the classic Shewhart, WV, WSD and SC methods are compared by Monte Carlo simulation. The Weibull, gamma and lognormal distributions are chosen since they can represent a wide variety of shapes, from nearly symmetric to highly skewed.

• The probability density function of the Weibull distribution is defined as

$$f(x \mid \beta, \lambda) = \beta\lambda^{\beta}x^{\beta-1}\exp\left(-(\lambda x)^{\beta}\right)$$

for $x > 0$, where $\beta$ is a shape parameter and $\lambda$ is a scale parameter.

• The probability density function of the gamma distribution is defined as

$$f(x \mid \alpha, \beta) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)}x^{\alpha-1}\exp\left(-\frac{x}{\beta}\right)$$

for $x > 0$, where $\alpha$ is a shape parameter and $\beta$ is a scale parameter.

• The probability density function of the lognormal distribution is defined as

$$f(x \mid \sigma, \mu) = \frac{1}{x\sigma\sqrt{2\pi}}\exp\left(-\frac{(\ln(x) - \mu)^2}{2\sigma^2}\right)$$

for $x > 0$, where $\sigma$ is a scale parameter and $\mu$ is a location parameter.

In the application, the quality variable X has the Weibull distribution with shape parameter $\beta$ and scale parameter $\lambda = 1$, the gamma distribution with shape parameter $\alpha$ and scale parameter $\beta = 1$, or the lognormal distribution with scale parameter $\sigma$ and location parameter $\mu = 0$. The values $\lambda = 1$ for the Weibull scale, $\beta = 1$ for the gamma scale and $\mu = 0$ for the lognormal location are chosen because the skewness does not depend on these parameters. The values of $P_X$, the skewness and the parameters of the distributions are given in Table 1.

For the simulation study, the $\bar{X}$ and R chart constants $W_U$, $W_L$, $V_U$ and $V_L$ of the WV method were obtained for the selected combinations of n and $P_X$; Table 2 gives these constants. When $P_X$ is equal to 0.50, the $\bar{X}$ chart constants $W_U$ and $W_L$ are the same. The $\bar{X}$ chart constants $W_U$ for the case $P_X < 0.50$ are the same as $W_L$ for $1 - P_X$ (Bai and Choi 1995).

Table 1: The values of $P_X$, the skewness and the parameters of distributions.
          Lognormal       Weibull         Gamma
  k3      σ      P_X      β      P_X      α       P_X
  0.50    0.16   0.53     2.15   0.54     16.00   0.53
  1.00    0.32   0.56     1.57   0.57      4.00   0.57
  1.50    0.44   0.59     1.20   0.61      1.80   0.60
  2.00    0.54   0.61     1.00   0.63      1.00   0.63
  2.50    0.66   0.63     0.86   0.66      0.64   0.66
  3.00    0.72   0.64     0.77   0.68      0.44   0.69

Table 3 gives the control chart constants $d_2^*$ for selected combinations of n and $P_X$ when the underlying distribution is Weibull, gamma or lognormal. The constants $d_2^*$ were obtained by Chang and Bai (2001). Table 4 gives the values of the constants $A_U^*$ and $A_L^*$ for the $\bar{X}$ chart and $D_3^*$ and $D_4^*$ for the R chart, for selected combinations of n and $k_3$.

The simulation consists of two segments. The steps of each segment are described below.

Segment 1:
1.a. Generate n i.i.d. Weibull($\beta$, 1), gamma($\alpha$, 1) and lognormal(0, $\sigma$) variates for n = 2, 3, 5.
1.b. Repeat step 1.a 30 times (k = 30).
1.c. Compute the control limits using Equations (2.3) and (2.4) for the weighted variance method, Equation (2.6) for the weighted standard deviations method, and Equations (2.11) and (2.12) for the skewness correction method.

Segment 2:
2.a. Generate n i.i.d. Weibull($\beta$, 1), gamma($\alpha$, 1) and lognormal(0, $\sigma$) variates using the procedure of step 1.a.
2.b. Repeat step 2.a 100 times (k = 100).
2.c. Compute the sample statistics for the $\bar{X}$ and R charts for the four methods.
2.d. Record whether or not the sample statistics calculated in step 2.c fall within the control limits of step 1.c, for all methods.
2.e. Repeat steps 1.a through 2.d 10000 times and obtain an average Type I risk for each method.

The average Type I risks of the four methods estimated by Monte Carlo simulation are plotted in Figures 1 to 3 for selected combinations of n and distributions. As seen from the figures, as the skewness increases, the Type I risk of the SC method is smaller than that of the other methods.
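The two-segment procedure can be sketched for a single case: the $\bar{X}$ chart with n = 5 on exponential data (gamma with $\alpha = 1$, $k_3 = 2.00$). This is a reduced illustration, not the authors' program. It uses 300 repetitions instead of 10000, takes 0.577 as the standard Shewhart $A_2$ constant for n = 5, and reads the WV pair (0.57, 0.74) from Table 2 and the SC pair (0.42, 0.85) from Table 4 under the assumption that each table lists the lower-limit constant before or alongside the upper-limit one as interpreted here. All function and variable names are hypothetical.

```python
import random

N = 5  # subgroup size
# (lower, upper) multipliers of R-bar for each X-bar chart, n = 5, k3 = 2.00.
CHARTS = {
    "Shewhart": (0.577, 0.577),  # standard A2 constant
    "WV": (0.57, 0.74),          # W_L, W_U as read from Table 2 (assumed order)
    "SC": (0.42, 0.85),          # A_L*, A_U* as read from Table 4 (assumed order)
}

def subgroup_stats(k, rng):
    """Means and ranges of k in-control exponential subgroups of size N."""
    means, ranges = [], []
    for _ in range(k):
        xs = [rng.expovariate(1.0) for _ in range(N)]
        means.append(sum(xs) / N)
        ranges.append(max(xs) - min(xs))
    return means, ranges

def type_one_risk(reps=300, k1=30, k2=100, seed=1):
    rng = random.Random(seed)
    alarms = {name: 0 for name in CHARTS}
    for _ in range(reps):
        # Segment 1: estimate the limits from k1 subgroups.
        means, ranges = subgroup_stats(k1, rng)
        grand_mean = sum(means) / k1
        r_bar = sum(ranges) / k1
        # Segment 2: count fresh in-control subgroup means outside the limits.
        new_means, _ = subgroup_stats(k2, rng)
        for name, (lower, upper) in CHARTS.items():
            lcl = grand_mean - lower * r_bar
            ucl = grand_mean + upper * r_bar
            alarms[name] += sum(1 for m in new_means if m < lcl or m > ucl)
    return {name: count / (reps * k2) for name, count in alarms.items()}

risks = type_one_risk()
for name, risk in risks.items():
    print(f"{name:9s} estimated Type I risk: {risk:.4f}")
```

All three charts are evaluated on the same simulated subgroups, so differences between the estimated risks reflect the limits rather than sampling noise between methods. Extending the sketch to the R chart or to other distributions would require the corresponding constants from Tables 2 to 4.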
When the distribution is approximately symmetric, the Type I risks of the SC, WSD, WV and Shewhart $\bar{X}$ charts are comparable, while the SC R chart has a noticeably smaller Type I risk.

Table 2: $\bar{X}$ and R chart constants of the WV method.

                           W_L                 W_U                 V_L                 V_U
  Dist.      k3    P_X     n=2   n=3   n=5     n=2   n=3   n=5     n=2   n=3   n=5     n=2   n=3   n=5
  Weibull    0.50  0.54    1.83  0.99  0.56    1.97  1.08  0.61    0.00  0.00  0.00    3.43  2.72  2.25
             1.00  0.57    1.82  0.98  0.56    2.09  1.13  0.64    0.00  0.00  0.00    3.67  2.90  2.45
             1.50  0.61    1.87  1.01  0.57    2.45  1.32  0.74    0.00  0.00  0.00    4.06  3.36  2.82
             2.00  0.63    1.87  1.01  0.57    2.45  1.32  0.74    0.00  0.00  0.00    4.51  3.62  3.06
             2.50  0.66    1.96  1.08  0.58    2.74  1.49  0.81    0.00  0.00  0.00    5.23  4.16  3.52
             3.00  0.68    2.04  1.09  0.61    2.98  1.59  0.88    0.00  0.00  0.00    5.71  4.64  4.02
  Gamma      0.50  0.53    1.85  1.00  0.57    1.95  1.07  0.60    0.00  0.00  0.00    3.39  2.68  2.22
             1.00  0.57    1.82  0.98  0.56    2.09  1.13  0.64    0.00  0.00  0.00    3.67  2.90  2.45
             1.50  0.60    1.84  0.99  0.56    2.26  1.22  0.68    0.00  0.00  0.00    4.06  3.23  2.70
             2.00  0.63    1.87  1.01  0.57    2.45  1.32  0.74    0.00  0.00  0.00    4.51  3.62  3.06
             2.50  0.66    1.96  1.08  0.58    2.74  1.49  0.81    0.00  0.00  0.00    5.23  4.16  3.52
             3.00  0.69    2.85  1.13  0.63    3.12  1.69  0.94    0.00  0.00  0.00    5.90  5.03  4.34
  Lognormal  0.50  0.53    1.85  1.00  0.57    1.95  1.07  0.60    0.00  0.00  0.00    3.39  2.68  2.22
             1.00  0.56    1.81  0.98  0.56    2.04  1.11  0.63    0.00  0.00  0.00    3.58  2.83  2.38
             1.50  0.59    1.83  0.98  0.56    2.20  1.18  0.67    0.00  0.00  0.00    3.93  3.10  2.61
             2.00  0.61    1.85  0.99  0.57    2.31  1.25  0.74    0.00  0.00  0.00    4.18  3.36  2.82
             2.50  0.63    1.87  1.01  0.57    2.45  1.32  0.74    0.00  0.00  0.00    4.51  3.62  3.06
             3.00  0.64    1.89  1.02  0.57    2.53  1.36  0.76    0.00  0.00  0.00    4.72  3.76  3.18

Table 3: Control chart constants $d_2^*$ for WSD method.
          Weibull                 Gamma                   Lognormal
  k3      n=2    n=3    n=5       n=2    n=3    n=5       n=2    n=3    n=5
  0.50    1.123  1.685  2.306     1.121  1.681  2.311     1.119  1.679  2.309
  1.00    1.096  1.644  2.255     1.090  1.634  2.251     1.094  1.642  2.265
  1.50    1.040  1.560  2.154     1.052  1.578  2.180     1.052  1.577  2.188
  2.00    1.004  1.505  2.090     1.004  1.505  2.090     1.015  1.522  2.120
  2.50    0.940  1.410  1.977     0.947  1.421  1.987     0.970  1.455  2.038
  3.00    0.892  1.338  1.889     0.885  1.327  1.874     0.946  1.419  1.994

Table 4: The constants of $\bar{X}$ and R charts for the SC method.

          n = 2                       n = 3                       n = 5
  k3      A_U*  A_L*  D_4*  D_3*      A_U*  A_L*  D_4*  D_3*      A_U*  A_L*  D_4*  D_3*
  0.50    2.20  1.62  4.26  0.00      1.16  0.90  3.12  0.00      0.65  0.53  2.45  0.15
  1.00    2.49  1.57  4.56  0.00      1.31  0.81  3.43  0.00      0.71  0.48  2.75  0.17
  1.50    2.78  1.25  4.95  0.00      1.46  0.73  3.82  0.00      0.78  0.45  3.10  0.15
  2.00    3.02  1.15  5.32  0.00      1.60  0.68  4.20  0.00      0.85  0.42  3.44  0.11
  2.50    3.22  1.23  5.66  0.00      1.71  0.65  4.53  0.00      0.92  0.40  3.75  0.06
  3.00    3.39  1.18  5.97  0.00      1.82  0.64  4.82  0.00      0.98  0.39  4.03  0.03

4 Results

When the quality variable has a skewed distribution, monitoring the process with the Shewhart $\bar{X}$ and R control charts may be misleading. Because of the variability in the population, using Shewhart $\bar{X}$ and R control charts for skewed distributions increases the Type I risk as the skewness increases. Therefore, to reflect the variability of the population, the WV, WSD and SC methods, which use asymmetric control limits, are applied in this study as alternatives to the classical method. Comparing these methods for the Weibull, gamma and lognormal distributions gives the following results:

• The Shewhart chart has the worst performance. As the skewness increases, the Type I risks of the Shewhart charts increase sharply.
• When the distribution is approximately symmetric, the Type I risks of the SC, WSD, WV and Shewhart $\bar{X}$ charts are comparable, while the SC R chart has a noticeably smaller Type I risk.
• As the skewness increases, for the $\bar{X}$ chart WV gives better results than Shewhart, WSD gives better results than Shewhart and WV, and SC gives better results than the other methods.
• As the skewness increases, for the R chart WV gives better results than Shewhart, and SC gives better results than Shewhart and WV.
• The differences in Type I risk among the four methods are more pronounced in the R chart than in the $\bar{X}$ chart.
• The Type I risks of the SC $\bar{X}$ and especially the SC R charts are closer to 0.27% than those of the WSD, WV and Shewhart charts, particularly when $k_3$ increases.
• In terms of Type I risk, no difference is observed among the Weibull, gamma and lognormal distributions.
• In terms of Type I risk, no difference is observed across the subgroup sizes (n).

[Figure 1: Type I risks of $\bar{X}$ and R charts for Weibull distribution. Panels: (a) $\bar{X}$ chart for n=2, (b) R chart for n=2, (c) $\bar{X}$ chart for n=3, (d) R chart for n=3, (e) $\bar{X}$ chart for n=5, (f) R chart for n=5. Horizontal axis from 0.0 to 3.0; curves for SC, WSD, WV and Shewhart.]

[Figure 2: Type I risks of $\bar{X}$ and R charts for Gamma distribution. Panels: (a) $\bar{X}$ chart for n=2, (b) R chart for n=2, (c) $\bar{X}$ chart for n=3, (d) R chart for n=3, (e) $\bar{X}$ chart for n=5, (f) R chart for n=5. Horizontal axis from 0.0 to 3.0; curves for SC, WSD, WV and Shewhart.]

The Type I risks of the WSD, Shewhart and WV methods are the same when the underlying distribution is symmetric. The WSD and WV methods perform significantly
better than the Shewhart method as the skewness increases, and the WSD method performs better than the WV method for all ranges of skewness; the WSD $\bar{X}$ charts likewise perform better than the WV $\bar{X}$ charts. In particular, when the sample size is small, the WSD method gives significantly better performance and can be used effectively when the process parameters are unknown. When the process parameters are unknown and have to be estimated from preliminary run samples, the SC method performs robustly and very well for all the tested distributions. When the distribution is approximately symmetric (i.e., $k_3 \approx 0$), the Type I risks of the WV, WSD and SC $\bar{X}$ charts are comparable, while the SC R chart has a noticeably smaller Type I risk; the Type I risks of the SC $\bar{X}$ and in particular the SC R charts are closer to 0.27% than those of the WV charts, especially when $k_3$ increases.

[Figure 3: Type I risks of $\bar{X}$ and R charts for Lognormal distribution. Panels: (a) $\bar{X}$ chart for n=2, (b) R chart for n=2, (c) $\bar{X}$ chart for n=3, (d) R chart for n=3, (e) $\bar{X}$ chart for n=5, (f) R chart for n=5. Horizontal axis from 0.0 to 3.0; curves for SC, WV and Shewhart.]

In this paper, $\bar{X}$ and R control charts based on the WV, WSD and SC methods are considered. The control limits are asymmetric for skewed distributions, and the charts reduce to the Shewhart charts when the process distribution is symmetric. The simulation study shows that the Type I risks of the WV, WSD and SC methods are comparable for approximately symmetric distributions, and that SC offers considerable improvement over the WV charts when it is desirable for the Type I risk to be close to the conventional 0.27%.

References

[1] Bai, D.S. and Choi, I.S. (1995): X and R control charts for skewed populations. Journal of Quality Technology, 27, 120-131.
[2] Burr, I.W.
(1967): The effects of non-normality on constants for X and R charts. Industrial Quality Control, 24, 563-569.
[3] Chan, L.K. and Cui, H.J. (2003): Skewness correction X and R charts for skewed distributions. Naval Research Logistics, 50, 1-19.
[4] Chan, L.K., Hapuarachchi, K.P., and Macpherson, B.D. (1998): Robustness of X and R charts. IEEE Transactions on Reliability, 37, 117-123.
[5] Chang, Y.S. and Bai, D.S. (2001): Control charts for positively skewed populations with weighted standard deviations. Quality and Reliability Engineering International, 17, 397-406.
[6] Choobineh, F. and Branting, D. (1986): A simple approximation for semivariance. European Journal of Operations Research, 27, 364-370.
[7] Choobineh, F. and Ballard, J.L. (1987): Control-limits of QC charts for skewed distributions using weighted variance. IEEE Transactions on Reliability, 36, 473-477.
[8] Cowden, D.J. (1957): Statistical Methods in Quality Control. New Jersey: Prentice-Hall.
[9] Duncan, A.J. (1974): Quality Control and Industrial Statistics. Richard D. Irwin, Inc.
[10] Montgomery, D.C. (1997): Introduction to Statistical Quality Control. John Wiley & Sons, Inc., USA.
[11] Nelson, P.R. (1979): Control charts for Weibull processes with standards given. IEEE Transactions on Reliability, 28, 283-387.
[12] Yourstone, S.A. and Zimmer, W.J. (1992): Non-normality and the design of control charts for average. Decision Science, 23, 1099-1113.