Metodoloski zvezki, Vol. 11, No. 2, 2014, 79-92
Confidence Interval for the Process Capability Index Cp Based on the Bootstrap-t Confidence
Interval for the Standard Deviation
Wararit Panichkitkosolkul1
Abstract
This paper proposes a confidence interval for the process capability index based on the bootstrap-t confidence interval for the standard deviation. A Monte Carlo simulation study was conducted to compare the performance of the proposed confidence interval with the existing confidence interval based on the confidence interval for the standard deviation. Simulation results show that the proposed confidence interval performs well in terms of coverage probability for more highly skewed distributions. On the other hand, the existing confidence interval has a coverage probability close to the nominal level for symmetrical or less skewed distributions. R code for computing the confidence intervals is provided.
1 Introduction
Statistical process quality control has been widely applied in many industries. One of the quality measurement tools used for improvement of quality and productivity is the process capability index (PCI). Process capability indices are practical tools for establishing the relationship between the actual process performance and the manufacturing specifications. Although there are many process capability indices, the most commonly used index is Cp (Kane, 1986; Zhang, 2010). In this paper, we
focus on the process capability index Cp, defined by Kane (1986) as:
$$C_p = \frac{USL - LSL}{6\sigma}, \tag{1}$$
where USL is the upper specification limit, LSL is the lower specification limit, and $\sigma$ is the process standard deviation. The numerator of Cp gives the size of the range over which the process measurements can vary. The denominator gives the size of the range over which the process actually varies (Kotz and Lovelace, 1998). Because the process standard deviation is unknown, it must be estimated from the sample data $\{X_1, \ldots, X_n\}$. The sample standard deviation
$$S = \left( (n-1)^{-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^{1/2}$$
is used to estimate the unknown parameter $\sigma$ in Equation (1). The estimator of the process capability index Cp is therefore
$$\hat{C}_p = \frac{USL - LSL}{6S}. \tag{2}$$
1 Department of Mathematics and Statistics, Faculty of Science and Technology, Thammasat University, Thailand; wararit@mathstat.sci.tu.ac.th
Although the point estimator of the capability index Cp shown in Equation (2) can be a useful measure, the confidence interval is more useful. A confidence interval provides much more information about the population characteristic of interest than does a point estimate (e.g., Smithson, 2001; Thompson, 2002; Steiger, 2004). The confidence interval for the capability index Cp is constructed by using the pivotal quantity $Q = (n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$. Therefore, the $(1-\alpha)100\%$ confidence interval for the capability index Cp is
$$\left( \hat{C}_p \sqrt{\frac{\chi^2_{\alpha/2,n-1}}{n-1}},\ \hat{C}_p \sqrt{\frac{\chi^2_{1-\alpha/2,n-1}}{n-1}} \right), \tag{3}$$
where $\chi^2_{\alpha/2,n-1}$ and $\chi^2_{1-\alpha/2,n-1}$ are the $(\alpha/2)100$th and $(1-\alpha/2)100$th percentiles of the central chi-square distribution with $n-1$ degrees of freedom.
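To illustrate how the width of the interval in Equation (3) depends on the sample size, the following R sketch tabulates the two factors that multiply the point estimate, using the sample sizes considered later in the simulation study. The variable names are illustrative and do not come from the paper.

```r
# Multipliers of the point estimate in Equation (3), for a 95% interval.
alpha <- 0.05
n <- c(10, 25, 50, 100)

# Lower and upper multipliers: sqrt(chi-square percentile / (n - 1)).
lower_mult <- sqrt(qchisq(alpha / 2, df = n - 1) / (n - 1))
upper_mult <- sqrt(qchisq(1 - alpha / 2, df = n - 1) / (n - 1))

print(data.frame(n, lower_mult, upper_mult))
```

Because the multipliers depend only on n and the confidence level, the relative width of the interval is known before any data are collected.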
The confidence interval for the process capability index Cp shown in Equation (3) is intended for normally distributed data. Its coverage probability is close to the nominal value of $1-\alpha$ when the data are normally distributed. However, the underlying process distributions are non-normal in many industrial processes (e.g., Chen and Pearn, 1997; Bittanti et al., 1998; Wu et al., 1999; Chang et al., 2002; Ding, 2004). In these cases, the coverage probability of the confidence interval can be appreciably below $1-\alpha$. Cojbasic and Tomovic (2007) presented a nonparametric confidence interval for the population variance based on ordinary t-statistics combined with the bootstrap method for a skewed distribution. In this paper, we propose a new confidence interval for the process capability index Cp based on the bootstrap-t confidence interval proposed by Cojbasic and Tomovic (2007).
The paper is organized as follows. In Section 2, the theoretical background of the existing confidence interval for Cp is discussed. In Section 3, we provide an analytical formula for the confidence interval for Cp based on the bootstrap-t confidence interval for the standard deviation. In Section 4, the performance of the confidence intervals for Cp is investigated through a Monte Carlo simulation study. Conclusions are provided in the final section.
2 Existing confidence interval for the process capability index
Suppose $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. A well-known $(1-\alpha)100\%$ confidence interval for the population variance $\sigma^2$, using the pivotal quantity $Q = (n-1)S^2/\sigma^2$, is (Cojbasic and Loncar, 2011)
$$\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}}, \tag{4}$$
where $S^2 = (n-1)^{-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$, and $\chi^2_{\alpha/2,n-1}$ and $\chi^2_{1-\alpha/2,n-1}$ are the $(\alpha/2)100$th and $(1-\alpha/2)100$th percentiles of the central chi-square distribution with $n-1$ degrees of freedom, respectively. From Equation (4), we have
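$$\begin{aligned}
P\left( \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} \right) &= 1-\alpha \\
P\left( \frac{\chi^2_{\alpha/2,n-1}}{(n-1)S^2} < \frac{1}{\sigma^2} < \frac{\chi^2_{1-\alpha/2,n-1}}{(n-1)S^2} \right) &= 1-\alpha \\
P\left( \sqrt{\frac{\chi^2_{\alpha/2,n-1}}{(n-1)S^2}} < \frac{1}{\sigma} < \sqrt{\frac{\chi^2_{1-\alpha/2,n-1}}{(n-1)S^2}} \right) &= 1-\alpha \\
P\left( \frac{USL-LSL}{6} \sqrt{\frac{\chi^2_{\alpha/2,n-1}}{(n-1)S^2}} < \frac{USL-LSL}{6\sigma} < \frac{USL-LSL}{6} \sqrt{\frac{\chi^2_{1-\alpha/2,n-1}}{(n-1)S^2}} \right) &= 1-\alpha \\
P\left( \frac{USL-LSL}{6S} \sqrt{\frac{\chi^2_{\alpha/2,n-1}}{n-1}} < C_p < \frac{USL-LSL}{6S} \sqrt{\frac{\chi^2_{1-\alpha/2,n-1}}{n-1}} \right) &= 1-\alpha.
\end{aligned}$$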
We obtain a $(1-\alpha)100\%$ confidence interval for Cp based on the confidence interval for the standard deviation, which is
$$CI_1 = \left( \hat{C}_p \sqrt{\frac{\chi^2_{\alpha/2,n-1}}{n-1}},\ \hat{C}_p \sqrt{\frac{\chi^2_{1-\alpha/2,n-1}}{n-1}} \right). \tag{5}$$
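The chain of probability statements above can be checked numerically. The following R sketch, using an illustrative simulated sample and variable names that are not from the paper, maps the variance interval of Equation (4) through Cp = (USL - LSL)/(6σ) and compares the result with the direct form of Equation (5).

```r
set.seed(1)
x <- rnorm(25, mean = 50, sd = 1)   # illustrative normal sample
LSL <- 47; USL <- 53; alpha <- 0.05
n <- length(x)
S2 <- var(x)

# Chi-square percentiles (alpha/2 and 1 - alpha/2) with n - 1 degrees of freedom.
q <- qchisq(c(alpha / 2, 1 - alpha / 2), df = n - 1)

# Equation (4): (lower, upper) limits for the variance.
var_ci <- (n - 1) * S2 / rev(q)

# Cp is decreasing in sigma, so the upper variance limit gives the lower Cp limit.
cp_from_var <- (USL - LSL) / (6 * sqrt(rev(var_ci)))

# Equation (5): point estimate times the chi-square multipliers.
cp_hat <- (USL - LSL) / (6 * sqrt(S2))
cp_direct <- cp_hat * sqrt(q / (n - 1))

print(rbind(cp_from_var, cp_direct))
```

The two rows agree up to floating-point error, which reflects that Equation (5) is a monotone transformation of Equation (4).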
3 Proposed confidence interval for the process capability index
The bootstrap, introduced by Efron (1979), is a computer-based resampling method for assigning measures of accuracy to statistical estimates (Efron and Tibshirani, 1993). For a sequence of independent and identically distributed (i.i.d.) random variables, the bootstrap procedure can be defined as follows (Tosasukul et al., 2009). Let $X_1, X_2, \ldots, X_n$ be independently and identically distributed random variables from some distribution with mean $\mu$ and variance $\sigma^2$. Let the random variables $\{X_j^*, 1 \le j \le m\}$ be the result of sampling m times with replacement from the n observations $X_1, X_2, \ldots, X_n$. The random variables $\{X_j^*, 1 \le j \le m\}$ are called the bootstrap samples from the original data $X_1, X_2, \ldots, X_n$. A confidence interval for the population variance can be constructed using the aforementioned pivotal quantity $Q = (n-1)S^2/\sigma^2$. For large sample sizes, a central chi-square distribution with $n-1$ degrees of freedom can be approximated by a normal distribution with mean $n-1$ and variance $2(n-1)$ (Cojbasic and Tomovic, 2007). Therefore, the distribution of the standardized variable
$$Z = \frac{\dfrac{(n-1)S^2}{\sigma^2} - (n-1)}{\sqrt{2(n-1)}} = \frac{S^2 - \sigma^2}{\sqrt{\mathrm{var}(S^2)}}$$
converges to a standard normal distribution as n increases to infinity. The bootstrap confidence interval for $\sigma^2$ is calculated based on the statistic
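$$T = \frac{S^2 - \sigma^2}{\sqrt{\widehat{\mathrm{var}}(S^2)}},$$
where $\widehat{\mathrm{var}}(S^2)$ is a consistent estimator of the variance of $S^2$. Casella and Berger (2001) give the following estimator of $\mathrm{var}(S^2)$ for a non-normal distribution:
$$\widehat{\mathrm{var}}(S^2) = \frac{1}{n} \left( \hat{\mu}_4 - \frac{n-3}{n-1} S^4 \right), \qquad \hat{\mu}_4 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^4.$$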
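As a minimal sketch, the moment-based estimator of var(S²) can be written in R as below. The function name `var_s2_hat` is hypothetical, and the form (1/n)(μ̂₄ − (n−3)/(n−1)·S⁴) is assumed.

```r
# Hypothetical helper: moment-based estimate of var(S^2),
# (1/n) * (mu4_hat - (n - 3) / (n - 1) * S^4).
var_s2_hat <- function(x) {
  n <- length(x)
  stopifnot(n >= 2)
  mu4_hat <- mean((x - mean(x))^4)   # fourth central sample moment (divisor n)
  (mu4_hat - (n - 3) / (n - 1) * var(x)^2) / n
}

set.seed(1)
x <- rexp(50)   # illustrative right-skewed sample
c(moment_based  = var_s2_hat(x),
  normal_theory = 2 * var(x)^2 / (length(x) - 1))
```

The second value is the plug-in version of the normal-theory variance 2σ⁴/(n − 1). It is shown only for comparison; for heavy-tailed data the two estimates can differ considerably.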
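After re-sampling B bootstrap samples, in each bootstrap sample we compute the value of the following statistic
$$T^* = \frac{S^{*2} - S^2}{\sqrt{\widehat{\mathrm{var}}(S^{*2})}}, \tag{6}$$
where $S^{*2}$ is a bootstrap replication of the statistic $S^2$,
$$\widehat{\mathrm{var}}(S^{*2}) = \frac{1}{n} \left( \hat{\mu}_4^* - \frac{n-3}{n-1} S^{*4} \right),$$
and $\hat{\mu}_4^* = \frac{1}{m} \sum_{j=1}^{m} (X_j^* - \bar{X}^*)^4$. The $(1-\alpha)100\%$ bootstrap-t confidence interval for $\sigma^2$ is
$$\left( \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}},\ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}} \right),$$
where $t^*_{\alpha/2}$ and $t^*_{1-\alpha/2}$ are the $(\alpha/2)100$th and $(1-\alpha/2)100$th percentiles of $T^*$ shown in Equation (6). Additionally, the $(1-\alpha)100\%$ confidence interval for the standard deviation $\sigma$ is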
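$$\left( \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}} \right]^{1/2},\ \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}} \right]^{1/2} \right). \tag{7}$$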
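The bootstrap step that produces the percentiles of T* in Equation (6) can be sketched in R as follows. This is an illustration under stated assumptions rather than the author's implementation: the function name `boot_t_star` is hypothetical, the bootstrap sample size is taken as m = n, and replicates with a non-positive variance estimate are set to NA.

```r
# Sketch only: names are illustrative; bootstrap sample size m = n is assumed.
boot_t_star <- function(x, alpha = 0.05, B = 1000) {
  n <- length(x)
  S2 <- var(x)
  t_star <- replicate(B, {
    xs <- sample(x, size = n, replace = TRUE)   # one bootstrap sample
    s2_star <- var(xs)                          # bootstrap replication of S^2
    mu4_star <- mean((xs - mean(xs))^4)         # fourth central sample moment
    v_star <- (mu4_star - (n - 3) / (n - 1) * s2_star^2) / n
    if (v_star > 0) (s2_star - S2) / sqrt(v_star) else NA_real_
  })
  list(t_star = t_star,
       percentiles = quantile(t_star, probs = c(alpha / 2, 1 - alpha / 2),
                              na.rm = TRUE, names = FALSE))
}

set.seed(2014)
x <- rgamma(30, shape = 4, rate = 2) + 48   # illustrative right-skewed sample
res <- boot_t_star(x, B = 500)
res$percentiles
```

Note that the code in the Appendix standardizes each replicate as sqrt((n-1)/2)·(S*²/S² − 1) rather than with the moment-based variance estimate, so the two approaches generally give different percentiles.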
Then, from Equation (7), we construct the confidence interval for the Cp based on the bootstrap-t confidence interval for the standard deviation which is
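$$\begin{aligned}
P\left( \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}} \right]^{1/2} < \sigma < \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}} \right]^{1/2} \right) &= 1-\alpha \\
P\left( \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2} < \frac{1}{\sigma} < \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2} \right) &= 1-\alpha \\
P\left( \frac{USL-LSL}{6} \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2} < \frac{USL-LSL}{6\sigma} < \frac{USL-LSL}{6} \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2} \right) &= 1-\alpha \\
P\left( \frac{USL-LSL}{6} \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2} < C_p < \frac{USL-LSL}{6} \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2} \right) &= 1-\alpha.
\end{aligned}$$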
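Therefore, the confidence interval for Cp based on the bootstrap-t confidence interval for the standard deviation is given by
$$CI_2 = \left( \frac{USL-LSL}{6} \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2},\ \frac{USL-LSL}{6} \left[ \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}} \right]^{-1/2} \right). \tag{8}$$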
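One way to see where bounds of this form come from is to invert the standardized statistic directly. The derivation below is a sketch under assumptions that the text does not state explicitly: the statistic is taken in its large-sample normal-theory form, $T = \sqrt{(n-1)/2}\,(S^2/\sigma^2 - 1)$, the bootstrap percentiles satisfy $P(t^*_{\alpha/2} \le T \le t^*_{1-\alpha/2}) \approx 1-\alpha$, and $2 t^*_{\alpha/2} + \sqrt{2(n-1)} > 0$.

```latex
\begin{align*}
T \le t^*_{1-\alpha/2}
  &\iff \frac{S^2}{\sigma^2} \le 1 + t^*_{1-\alpha/2}\sqrt{\frac{2}{n-1}}
        = \frac{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}}{\sqrt{2(n-1)}} \\
  &\iff \sigma^2 \ge \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{1-\alpha/2} + \sqrt{2(n-1)}}, \\[4pt]
T \ge t^*_{\alpha/2}
  &\iff \sigma^2 \le \frac{S^2 \sqrt{2(n-1)}}{2 t^*_{\alpha/2} + \sqrt{2(n-1)}}.
\end{align*}
```

Taking square roots gives limits for $\sigma$, and because $C_p = (USL-LSL)/(6\sigma)$ is decreasing in $\sigma$, the upper limit for $\sigma$ yields the lower limit for $C_p$ and vice versa, which matches the ordering of the percentiles in Equation (8).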
All confidence intervals were implemented using the open source statistical package R (Ihaka and Gentleman, 1996); the source code is given in the Appendix.
4 Simulation study
To assess the performance of the proposed confidence interval, we conducted a
Monte Carlo simulation study to estimate the coverage probabilities and expected
lengths of the proposed confidence interval under different situations and compare
them with the existing confidence intervals. The estimated coverage probability and
the expected length (based on M replicates) are given by
$$\widehat{1-\alpha} = \frac{\#(L \le C_p \le U)}{M},$$
and
$$\text{Length} = \frac{\sum_{j=1}^{M} (U_j - L_j)}{M},$$
where $\#(L \le C_p \le U)$ denotes the number of simulation runs for which the true process capability index Cp lies within the confidence interval. The data were generated from the distributions given in Table 1, each with population mean $\mu = 50$ and population standard deviation $\sigma = 1$.
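The two performance measures can be estimated along the following lines. This R sketch covers only the existing interval under N(50,1) data and uses fewer replicates than the study; the helper `ci1` and all variable names are illustrative.

```r
set.seed(123)
M <- 2000; n <- 25; alpha <- 0.05
LSL <- 47; USL <- 53
cp_true <- (USL - LSL) / (6 * 1)   # true Cp when sigma = 1

# Chi-square based interval for Cp (illustrative helper).
ci1 <- function(x) {
  cp_hat <- (USL - LSL) / (6 * sd(x))
  df <- length(x) - 1
  cp_hat * sqrt(qchisq(c(alpha / 2, 1 - alpha / 2), df = df) / df)
}

# One row per simulation run: column 1 = lower limit, column 2 = upper limit.
bounds <- t(replicate(M, ci1(rnorm(n, mean = 50, sd = 1))))

coverage   <- mean(bounds[, 1] <= cp_true & cp_true <= bounds[, 2])
exp_length <- mean(bounds[, 2] - bounds[, 1])
c(coverage = coverage, expected_length = exp_length)
```

Replacing `rnorm` with a draw from one of the skewed distributions, and `ci1` with the bootstrap-t interval, gives the corresponding estimates for the other settings.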
Table 1: Probability distributions generated and the coefficient of skewness for Monte Carlo simulation.
Probability Distribution	Coefficient of Skewness
N(50,1)	0.000
Uniform(48.268, 51.732)	0.000
10 × Beta(4.4375,13.3125) + 47.5	0.506
Gamma(9,3) + 47	0.667
Gamma(4,2) + 48	1.000
Gamma(2.25,1.5) + 48.5	1.333
Gamma(1,1) + 49	2.000
Gamma(0.75,0.867) + 49.1340	2.309
Gamma(0.5,0.707) + 49.2929	2.828
Gamma(0.4,0.6325) + 49.3675	3.163
Gamma(0.3,0.5477) + 49.4523	3.651
Gamma(0.25,0.5) + 49.5	4.000
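The shifted gamma entries of Table 1 can be checked against the stated mean, standard deviation and skewness with a short R sketch. It assumes that the two gamma parameters are shape and rate; this reading is inferred from the requirement that each distribution has mean 50 and standard deviation 1, and is not stated explicitly in the table.

```r
# Gamma(shape k, rate r) + shift has mean k/r + shift, sd sqrt(k)/r, skewness 2/sqrt(k).
gamma_tab <- data.frame(
  shape = c(9, 4, 2.25, 1, 0.75, 0.5, 0.4, 0.3, 0.25),
  rate  = c(3, 2, 1.5, 1, 0.867, 0.707, 0.6325, 0.5477, 0.5),
  shift = c(47, 48, 48.5, 49, 49.1340, 49.2929, 49.3675, 49.4523, 49.5)
)
gamma_tab$mean <- with(gamma_tab, shape / rate + shift)
gamma_tab$sd   <- with(gamma_tab, sqrt(shape) / rate)
gamma_tab$skew <- 2 / sqrt(gamma_tab$shape)
print(round(gamma_tab, 3))

# Drawing one sample of size 25 from Gamma(4,2) + 48 under the same assumption:
set.seed(1)
x <- rgamma(25, shape = 4, rate = 2) + 48
```

Under this parameterization all rows have mean 50 and standard deviation 1 up to the rounding of the tabulated parameters.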
The true values of the process capability index Cp, together with LSL and USL, are given in Table 2.
Table 2: True values of Cp, LSL and USL used for Monte Carlo simulation.
True Value of Cp	LSL	USL
1.00	47.00	53.00
1.33	46.01	53.99
1.50	45.50	54.50
1.67	44.99	55.01
2.00	44.00	56.00
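The limits in Table 2 are consistent with specification limits placed symmetrically about the process mean, that is LSL = μ − 3·Cp·σ and USL = μ + 3·Cp·σ with μ = 50 and σ = 1. The symmetric placement is inferred from the table values rather than stated in the text; the R sketch below reproduces the table under that assumption.

```r
mu <- 50; sigma <- 1
cp <- c(1.00, 1.33, 1.50, 1.67, 2.00)

# Symmetric limits about the mean (assumption inferred from Table 2).
spec <- data.frame(Cp  = cp,
                   LSL = mu - 3 * cp * sigma,
                   USL = mu + 3 * cp * sigma)
print(spec)

# Recover Cp from the limits as a consistency check.
(spec$USL - spec$LSL) / (6 * sigma)
```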
The sample sizes simulated were 10, 25, 50 and 100, and the number of simulation trials was set to 10,000. The number of bootstrap samples was 1,000. The nominal confidence level was fixed at 0.95. All simulations were performed using programs written in the open source statistical package R (Ihaka and Gentleman, 1996).
The simulation results are presented for four cases. As can be seen from Figures 1 and 2, the existing confidence interval (CI1) has higher estimated coverage probabilities than the proposed confidence interval (CI2) when the data were generated from symmetrical and less skewed distributions (coefficient of skewness between 0 and 2) for all sample sizes. In particular, for the normal distribution, CI1 provides estimated coverage probabilities close to the nominal level 0.95, which are higher than those of CI2. In addition, the expected lengths of CI2 were shorter than those of CI1 for all sample sizes (see Figures 5 and 6).
On the other hand, for more skewed distributions (coefficient of skewness between 2.309 and 4), the estimated coverage probabilities of CI2 were greater than those of CI1 for almost all sample sizes, as shown in Figures 3 and 4. Figures 7 and 8 present the expected lengths of CI1 and CI2 for these right-skewed distributions. We found that the expected lengths of CI1 were shorter than those of CI2 for all sample sizes.
[Figure 1 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 1: The estimated coverage probabilities of CI1 and CI2 for Cp in case of N(50,1)
[Figure 2 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 2: The estimated coverage probabilities of CI1 and CI2 for Cp in case of Gamma(4,2) + 48
[Figure 3 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 3: The estimated coverage probabilities of CI1 and CI2 for Cp in case of Gamma(0.75,0.867) + 49.1340
[Figure 4 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 4: The estimated coverage probabilities of CI1 and CI2 for Cp in case of Gamma(0.25,0.5) + 49.5
[Figure 5 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 5: The expected lengths of CI1 and CI2 for Cp in case of N(50,1)
[Figure 6 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 6: The expected lengths of CI1 and CI2 for Cp in case of Gamma(4,2) + 48
[Figure 7 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 7: The expected lengths of CI1 and CI2 for Cp in case of Gamma(0.75,0.867) + 49.1340
[Figure 8 panels: n = 10, n = 25, n = 50, n = 100; horizontal axis: Cp, from 1.0 to 2.0]
Figure 8: The expected lengths of CI1 and CI2 for Cp in case of Gamma(0.25,0.5) + 49.5
5 Conclusions
The existing confidence interval for the capability index Cp based on the confidence interval for the standard deviation assumes a normal distribution. However, the underlying distribution may be non-normal or skewed in some circumstances. A confidence interval for the capability index Cp based on the bootstrap-t confidence interval for the standard deviation was therefore developed. The proposed confidence interval was compared with the existing confidence interval through a Monte Carlo simulation study. The proposed confidence interval performed better than the existing confidence interval in terms of coverage probability when the data have a coefficient of skewness > 2. On the other hand, when the data are symmetrical or have a coefficient of skewness < 2, the estimated coverage probability of the existing confidence interval can be close to the nominal level.
Appendix: Source R code for all confidence intervals
# Existing interval CI1, Equation (5): chi-square based interval for Cp.
CI1 <- function(x, LSL, USL, alpha) {
  n <- length(x)
  S <- sd(x)
  chisq1 <- qchisq(alpha/2, df = n-1)
  chisq2 <- qchisq(1-alpha/2, df = n-1)
  K <- (USL-LSL)/(6*S)
  ci.low <- K*sqrt(chisq1/(n-1))
  ci.up <- K*sqrt(chisq2/(n-1))
  out <- cbind(ci.low, ci.up)
  return(out)
}

# Proposed interval CI2, Equation (8): based on the bootstrap percentiles of T*.
CI2 <- function(x, LSL, USL, alpha) {
  n <- length(x)
  s2 <- var(x)
  percentile.T.S <- percentile.T.star(x, alpha)
  T1 <- percentile.T.S[1]
  T2 <- percentile.T.S[2]
  K1 <- (USL-LSL)/6
  K2 <- s2*sqrt(2*(n-1))
  ci.low <- K1*(K2/(2*T1+sqrt(2*(n-1))))^(-1/2)
  ci.up <- K1*(K2/(2*T2+sqrt(2*(n-1))))^(-1/2)
  out <- cbind(ci.low, ci.up)
  return(out)
}

# Bootstrap percentiles of T* from B = 1000 resamples of size n.
percentile.T.star <- function(x, alpha) {
  B <- 1000
  n <- length(x)
  S2 <- var(x)
  T.star <- numeric(B)
  for (i in 1:B) {
    xs <- sample(x, n, replace = TRUE)
    s2.star <- var(xs)
    T.star[i] <- sqrt((n-1)/2)*((s2.star/S2)-1)
  }
  # (alpha/2)100th and (1-alpha/2)100th percentiles, as used in Equation (8)
  T1 <- quantile(T.star, probs = alpha/2)
  T2 <- quantile(T.star, probs = 1-alpha/2)
  out <- cbind(T1, T2)
  return(out)
}
Acknowledgements
The author would like to thank the anonymous referees for their helpful comments, which resulted in an improved paper. The author is also thankful for the support in the form of the research funds awarded by Thammasat University.
References
[1]	Bittanti, S., Lovera, M. and Moiraghi, L. (1998): Application of non-normal process capability indices to semiconductor quality control. IEEE Transactions on Semiconductor Manufacturing, 11, 296-303.
[2]	Casella, G. and Berger, R.L. (2001): Statistical Inference. Duxbury Press: Pacific Grove.
[3]	Chen, K.S. and Pearn, W.L. (1997): An application of non-normal process capability indices. Quality and Reliability Engineering International, 13, 335-360.
[4]	Cojbasic, V. and Tomovic, A. (2007): Nonparametric confidence intervals for population variances of one sample and the difference of variances of two samples. Computational Statistics & Data Analysis, 51, 5562-5578.
[5]	Ding, J. (2004): A model of estimating process capability index from the first four moments of non-normal data. Quality and Reliability Engineering International, 20, 787-805.
[6]	Efron, B. (1979): Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7, 1-26.
[7]	Efron, B. and Tibshirani, R.J. (1993): An Introduction to the Bootstrap. Chapman & Hall: New York.
[8]	Ihaka, R. and Gentleman, R. (1996): R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5, 299-314.
[9]	Kane, V.E. (1986): Process Capability Indices. Journal of Quality Technology, 18, 41-52.
[10]	Kotz, S. and Johnson, N.L. (1993): Process Capability Indices. London: Chapman & Hall.
[11]	Kotz, S. and Lovelace, C.R. (1998): Process Capability Indices in Theory and Practice. Arnold: London.
[12]	Pearn, W.L. and Kotz, S. (2006): Encyclopedia and Handbook of Process Capability Indices: A Comprehensive Exposition of Quality Control Measures. Singapore: World Scientific.
[13]	Smithson, M. (2001): Correct confidence intervals for various regression effect sizes and parameters: the importance of noncentral distributions in computing intervals. Educational and Psychological Measurement, 61, 605-632.
[14]	Steiger, J.H. (2004): Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological Methods, 9, 164-182.
[15]	Thompson, B. (2002): What future quantitative social science research could look like: confidence intervals for effect sizes. Educational Researcher, 31, 25-32.
[16]	Tosasukul, J., Budsaba, K. and Volodin, A. (2009): Dependent bootstrap confidence intervals for a population mean. Thailand Statistician, 7, 43-51.
[17]	Wu, H.-H., Swain, J.J., Farrington, P.A., and Messimer, S.L. (1999): A weighted variance capability index for general non-normal processes. Quality and Reliability Engineering International, 15, 397-402.
[18]	Zhang, J. (2010): Conditional confidence intervals of process capability indices following rejection of preliminary tests. Ph.D. Thesis, The University of Texas at Arlington, USA.