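Metodoloski zvezki, Vol. 11, No. 1, 2014, 21-29

Optimal Unbiased Estimates of P{X < Y} for Some Families of Distributions

Marko Obradovic, Milan Jovanovic, Bojana Milosevic

Abstract

In reliability theory, one of the main problems is estimating parameter $R = P\{X < Y\}$ [...]

Theorem 3 The UMVUE of R is
$$\hat{R} = \iint I(x < y)\,\hat{f}(x, y)\,dx\,dy,$$
where $\hat{f}$ is given in Theorem 2 for $k = 1$.

2 Existing results

In this section we present a brief summary of existing results obtained for UMVUEs of R for some distributions.

• Exponential distribution

Let X and Y be independent exponentially distributed random variables with densities
$$f_X(x; \alpha) = \alpha e^{-\alpha x},\ x > 0, \qquad f_Y(y; \beta) = \beta e^{-\beta y},\ y > 0,$$
where $\alpha$ and $\beta$ are unknown positive parameters. The complete sufficient statistics for $\alpha$ and $\beta$ are $T_X = \sum_{j=1}^{n_1} X_j$ and $T_Y = \sum_{j=1}^{n_2} Y_j$.

The UMVUE of R was derived by Tong (1974; 1977), and it is given by
$$\hat{R} = \begin{cases} 1 - \displaystyle\sum_{i=0}^{n_1-1} (-1)^i \frac{\Gamma(n_1)\Gamma(n_2)}{\Gamma(n_1-i)\Gamma(n_2+i)} \left(\frac{T_Y}{T_X}\right)^{i}, & \text{if } T_Y < T_X, \\[2ex] \displaystyle\sum_{i=0}^{n_2-1} (-1)^i \frac{\Gamma(n_1)\Gamma(n_2)}{\Gamma(n_1+i)\Gamma(n_2-i)} \left(\frac{T_X}{T_Y}\right)^{i}, & \text{if } T_Y > T_X. \end{cases}$$

• Normal distribution

Let X and Y be normally distributed independent random variables with densities
$$f_X(x; \mu_1, \sigma_1^2) = \frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}},\ x \in \mathbb{R}, \qquad f_Y(y; \mu_2, \sigma_2^2) = \frac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}},\ y \in \mathbb{R},$$
where $\mu_1$, $\sigma_1^2$, $\mu_2$ and $\sigma_2^2$ are unknown parameters. The complete sufficient statistics for $(\mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$ are $(\bar{X}, S_X^2, \bar{Y}, S_Y^2)$.

The UMVUE of R was derived by Downton (1973) and it is given by
$$\hat{R} = \frac{1}{B\left(\frac{1}{2}, \frac{n_1-2}{2}\right) B\left(\frac{1}{2}, \frac{n_2-2}{2}\right)} \iint_{\Omega} (1-u^2)^{\frac{n_1-4}{2}} (1-v^2)^{\frac{n_2-4}{2}}\,du\,dv,$$
where $B(a, b)$ is the beta function and
$$\Omega = \left\{ (u, v) \in [-1,1] \times [-1,1] \;\middle|\; -\frac{u S_X (n_1-1)}{\sqrt{n_1}} + \frac{v S_Y (n_2-1)}{\sqrt{n_2}} + (\bar{Y} - \bar{X}) > 0 \right\}.$$

• Gamma distribution

Let X and Y be independent gamma distributed random variables with densities
$$f_X(x; \alpha_1, \sigma_1) = \frac{x^{\alpha_1-1}}{\Gamma(\alpha_1)\sigma_1^{\alpha_1}}\, e^{-\frac{x}{\sigma_1}},\ x > 0, \qquad f_Y(y; \alpha_2, \sigma_2) = \frac{y^{\alpha_2-1}}{\Gamma(\alpha_2)\sigma_2^{\alpha_2}}\, e^{-\frac{y}{\sigma_2}},\ y > 0,$$
where $\alpha_1$ and $\alpha_2$ are known integer values and $\sigma_1$ and $\sigma_2$ are unknown positive parameters. The complete sufficient statistics for $\sigma_1$ and $\sigma_2$ are $T_X = \sum_{j=1}^{n_1} X_j$ and $T_Y = \sum_{j=1}^{n_2} Y_j$.

The UMVUE of R was derived by Constantine et al.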
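(1986), and it is given by
$$\hat{R} = \begin{cases} 1 - \displaystyle\sum_{k=0}^{(n_2-1)\alpha_2-1} \frac{B(\alpha_1+\alpha_2+k,\,(n_1-1)\alpha_1)}{B(\alpha_1,(n_1-1)\alpha_1)\,B(\alpha_2,(n_2-1)\alpha_2)} \binom{(n_2-1)\alpha_2-1}{k} \frac{(-1)^k}{\alpha_2+k} \left(\frac{T_X}{T_Y}\right)^{\alpha_2+k}, & \text{if } T_X < T_Y, \\[2ex] \displaystyle\sum_{k=0}^{(n_1-1)\alpha_1-1} \frac{B(\alpha_1+\alpha_2+k,\,(n_2-1)\alpha_2)}{B(\alpha_2,(n_2-1)\alpha_2)\,B(\alpha_1,(n_1-1)\alpha_1)} \binom{(n_1-1)\alpha_1-1}{k} \frac{(-1)^k}{\alpha_1+k} \left(\frac{T_Y}{T_X}\right)^{\alpha_1+k}, & \text{if } T_X > T_Y. \end{cases}$$

• Gompertz distribution

Let X and Y be independent Gompertz distributed random variables with densities
$$f_X(x; c, \lambda_1) = \lambda_1 e^{cx} e^{-\frac{\lambda_1}{c}(e^{cx}-1)},\ x > 0, \qquad f_Y(y; c, \lambda_2) = \lambda_2 e^{cy} e^{-\frac{\lambda_2}{c}(e^{cy}-1)},\ y > 0,$$
where c is a known positive value and $\lambda_1$ and $\lambda_2$ are unknown positive parameters. The complete sufficient statistics for $\lambda_1$ and $\lambda_2$ are
$$W_X = \frac{1}{c}\sum_{j=1}^{n_1}\left(e^{cX_j} - 1\right), \qquad W_Y = \frac{1}{c}\sum_{j=1}^{n_2}\left(e^{cY_j} - 1\right).$$

The UMVUE of R was derived by Saracoglu et al. (2009) and it is given by
$$\hat{R} = \begin{cases} 1 - \displaystyle\sum_{k=0}^{n_1-1} (-1)^k \frac{\Gamma(n_1)\Gamma(n_2)}{\Gamma(n_1-k)\Gamma(n_2+k)} \left(\frac{W_Y}{W_X}\right)^{k}, & \text{if } W_Y < W_X, \\[2ex] \displaystyle\sum_{k=0}^{n_2-1} (-1)^k \frac{\Gamma(n_1)\Gamma(n_2)}{\Gamma(n_1+k)\Gamma(n_2-k)} \left(\frac{W_X}{W_Y}\right)^{k}, & \text{if } W_Y > W_X. \end{cases}$$

• Generalized Pareto distribution

Let X and Y be independent random variables from generalized Pareto distributions with densities
$$f_X(x; \alpha_1, \lambda) = \alpha_1 \lambda (1 + \lambda x)^{-(\alpha_1+1)},\ x > 0, \qquad f_Y(y; \alpha_2, \lambda) = \alpha_2 \lambda (1 + \lambda y)^{-(\alpha_2+1)},\ y > 0,$$
where $\lambda$ is a known positive value and $\alpha_1$ and $\alpha_2$ are unknown positive parameters. The complete sufficient statistics for $\alpha_1$ and $\alpha_2$ are $T_X = \sum_{j=1}^{n_1} \ln(1 + \lambda X_j)$ and $T_Y = \sum_{j=1}^{n_2} \ln(1 + \lambda Y_j)$.

The UMVUE of R was derived by Rezaei et al. (2010), and it is given by
$$\hat{R} = \begin{cases} \displaystyle\sum_{k=0}^{n_2-1} (-1)^k \frac{\Gamma(n_1)\Gamma(n_2)}{\Gamma(n_1+k)\Gamma(n_2-k)} \left(\frac{T_X}{T_Y}\right)^{k}, & \text{if } T_X < T_Y, \\[2ex] 1 - \displaystyle\sum_{k=0}^{n_1-1} (-1)^k \frac{\Gamma(n_1)\Gamma(n_2)}{\Gamma(n_1-k)\Gamma(n_2+k)} \left(\frac{T_Y}{T_X}\right)^{k}, & \text{if } T_X > T_Y. \end{cases}$$

• Poisson distribution

Let X and Y be independent Poisson distributed random variables with mass functions
$$P\{X = x; \lambda_1\} = \frac{e^{-\lambda_1}\lambda_1^{x}}{x!},\ x = 0, 1, \ldots, \qquad P\{Y = y; \lambda_2\} = \frac{e^{-\lambda_2}\lambda_2^{y}}{y!},\ y = 0, 1, \ldots,$$
where $\lambda_1$ and $\lambda_2$ are unknown positive parameters. The complete sufficient statistics for $\lambda_1$ and $\lambda_2$ are $T_X = \sum_{j=1}^{n_1} X_j$ and $T_Y = \sum_{j=1}^{n_2} Y_j$.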
The UMVUE of R was derived by Belyaev and Lumelskii (1988) and it is given by
$$\hat{R} = \sum_{x=0}^{\min\{T_X,\,T_Y-1\}} \sum_{y=x+1}^{T_Y} \binom{T_X}{x} \left(\frac{1}{n_1}\right)^{x} \left(1 - \frac{1}{n_1}\right)^{T_X-x} \binom{T_Y}{y} \left(\frac{1}{n_2}\right)^{y} \left(1 - \frac{1}{n_2}\right)^{T_Y-y}.$$

• Negative binomial distribution

Let X and Y be independent random variables from negative binomial distributions with mass functions
$$P\{X = x; m_1, p_1\} = \binom{m_1 + x - 1}{x} p_1^{x} (1 - p_1)^{m_1},\ x = 0, 1, \ldots,$$
$$P\{Y = y; m_2, p_2\} = \binom{m_2 + y - 1}{y} p_2^{y} (1 - p_2)^{m_2},\ y = 0, 1, \ldots,$$
where $m_1$ and $m_2$ are known integer values and $p_1$ and $p_2$ are unknown probabilities. The complete sufficient statistics for $p_1$ and $p_2$ are $T_X = \sum_{j=1}^{n_1} X_j$ and $T_Y = \sum_{j=1}^{n_2} Y_j$.

The UMVUE of R was derived by Ivshin and Lumelskii (1995) and it is given by
$$\hat{R} = \sum_{x=0}^{\min\{T_X,\,T_Y-1\}} \sum_{y=x+1}^{T_Y} \frac{\binom{m_1+x-1}{x} \binom{T_X-x+m_1(n_1-1)-1}{T_X-x} \binom{m_2+y-1}{y} \binom{T_Y-y+m_2(n_2-1)-1}{T_Y-y}}{\binom{m_1 n_1+T_X-1}{T_X} \binom{m_2 n_2+T_Y-1}{T_Y}}.$$

3 New results

In this section we derive the UMVUE of R for some new distributions. In the first model, stress and strength both have Weibull distributions with known but different shape parameters and unknown rate parameters. As a special case we present the model where stress has an exponential and strength has a Rayleigh distribution. An example with real data for the Weibull model is presented. In the second model, both stress and strength have logarithmic distributions with unknown parameters.

3.1 Weibull model

Let X and Y be independent random variables from Weibull distributions with densities
$$f_X(x; \alpha_1, \sigma_1) = \alpha_1 \sigma_1^{\alpha_1} x^{\alpha_1-1} e^{-(\sigma_1 x)^{\alpha_1}},\ x > 0,$$
$$f_Y(y; \alpha_2, \sigma_2) = \alpha_2 \sigma_2^{\alpha_2} y^{\alpha_2-1} e^{-(\sigma_2 y)^{\alpha_2}},\ y > 0.$$

The Weibull distribution is one of the most widely used distributions in modeling life data. Many researchers have studied the reliability of the Weibull model. Most of them did not consider unbiased estimators (e.g. Kundu and Gupta, 2006), and recently the case with a common known shape parameter $\alpha$ has been studied in (Amiri et al., 2013). We consider the case where the shape parameters $\alpha_1$ and $\alpha_2$ are known positive values, while the rate parameters $\sigma_1$ and $\sigma_2$ are unknown positive parameters.
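For orientation, the quantity being estimated in this model, $R = P\{X < Y\}$, can be evaluated numerically from the two Weibull densities above when all four parameters are given. The following is a minimal sketch, not part of the paper: it uses $R = \int_0^\infty f_X(x)\,e^{-(\sigma_2 x)^{\alpha_2}}\,dx$, substitutes $u = (\sigma_1 x)^{\alpha_1}$ so that the integrand becomes $e^{-u} e^{-(\sigma_2 u^{1/\alpha_1}/\sigma_1)^{\alpha_2}}$, and applies a composite Simpson rule on a truncated range. The function name `true_r_weibull`, the truncation point `upper` and the number of steps are illustrative assumptions.

```python
import math


def true_r_weibull(a1, s1, a2, s2, upper=60.0, steps=20000):
    """Approximate R = P{X < Y} for the rate-parametrized Weibull model.

    a1, a2 are the shape parameters and s1, s2 the rate parameters.
    The integral over u = (s1 * x) ** a1 is truncated at `upper`
    (an assumption: exp(-upper) is taken to be negligible) and
    evaluated with the composite Simpson rule; `steps` must be even.
    """
    def integrand(u):
        x = u ** (1.0 / a1) / s1          # invert the substitution
        return math.exp(-u - (s2 * x) ** a2)

    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2        # Simpson weights 4, 2, 4, ...
        total += weight * integrand(i * h)
    return total * h / 3.0
```

With equal shapes $\alpha_1 = \alpha_2 = 1$ the model is exponential and the integral reduces to $\sigma_1/(\sigma_1+\sigma_2)$, which gives a quick check: `true_r_weibull(1, 1, 1, 3)` should be close to 0.25.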
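The complete sufficient statistics for parameters $\sigma_1$ and $\sigma_2$ are $T_X = \sum_{j=1}^{n_1} X_j^{\alpha_1}$ and $T_Y = \sum_{j=1}^{n_2} Y_j^{\alpha_2}$. Since $X_j^{\alpha_1}$ and $Y_j^{\alpha_2}$ have exponential distributions with rate parameters $\sigma_1^{\alpha_1}$ and $\sigma_2^{\alpha_2}$, both statistics have gamma distributions, i.e. $T_X$ has $\Gamma(n_1, \sigma_1^{-\alpha_1})$ and $T_Y$ has $\Gamma(n_2, \sigma_2^{-\alpha_2})$. Similarly, for $k < \min(n_1, n_2)$, $T_X - \sum_{j=1}^{k} X_j^{\alpha_1}$ has $\Gamma(n_1 - k, \sigma_1^{-\alpha_1})$ and $T_Y - \sum_{j=1}^{k} Y_j^{\alpha_2}$ has $\Gamma(n_2 - k, \sigma_2^{-\alpha_2})$. Using this and the transformation of random variables $(X_1, \ldots, X_k, \sum_{j=k+1}^{n_1} X_j^{\alpha_1})$ to $(X_1, \ldots, X_k, T_X)$ we get, for $\sigma_1 = 1$,
$$g(t_X \mid X_1 = x_1, \ldots, X_k = x_k) = \frac{\left(t_X - \sum_{j=1}^{k} x_j^{\alpha_1}\right)^{n_1-k-1}}{\Gamma(n_1 - k)}\, e^{-\left(t_X - \sum_{j=1}^{k} x_j^{\alpha_1}\right)}\, I\Big\{t_X > \sum_{j=1}^{k} x_j^{\alpha_1}\Big\}.$$

Using Theorem 2, we get that
$$\hat{f}(x_1, \ldots, x_k) = \alpha_1^{k} \prod_{j=1}^{k} x_j^{\alpha_1-1}\, \frac{\left(t_X - \sum_{j=1}^{k} x_j^{\alpha_1}\right)^{n_1-k-1} \Gamma(n_1)}{t_X^{\,n_1-1}\, \Gamma(n_1 - k)}\, I\Big\{t_X > \sum_{j=1}^{k} x_j^{\alpha_1}\Big\}.$$

For $k = 1$ we obtain that
$$\hat{f}(x) = \alpha_1 (n_1 - 1)\, x^{\alpha_1-1}\, \frac{(t_X - x^{\alpha_1})^{n_1-2}}{t_X^{\,n_1-1}}\, I\{t_X > x^{\alpha_1}\}.$$

Analogously we get that
$$\hat{f}(y) = \alpha_2 (n_2 - 1)\, y^{\alpha_2-1}\, \frac{(t_Y - y^{\alpha_2})^{n_2-2}}{t_Y^{\,n_2-1}}\, I\{t_Y > y^{\alpha_2}\}.$$

Denote $M = \min\{t_X^{1/\alpha_1}, t_Y^{1/\alpha_2}\}$. Using the independence of samples and Theorem 3, we obtain
$$\begin{aligned} \hat{R} &= \int_0^\infty \int_0^\infty I\{x < y\}\, \hat{f}(x) \hat{f}(y)\,dx\,dy \\ &= \int_0^M \frac{\alpha_1 (n_1-1) x^{\alpha_1-1}}{t_X^{\,n_1-1}} (t_X - x^{\alpha_1})^{n_1-2}\,dx \int_x^{t_Y^{1/\alpha_2}} \frac{\alpha_2 (n_2-1) y^{\alpha_2-1}}{t_Y^{\,n_2-1}} (t_Y - y^{\alpha_2})^{n_2-2}\,dy \\ &= \int_0^M \frac{\alpha_1 (n_1-1) x^{\alpha_1-1}}{t_X^{\,n_1-1} t_Y^{\,n_2-1}} (t_X - x^{\alpha_1})^{n_1-2} (t_Y - x^{\alpha_2})^{n_2-1}\,dx. \end{aligned}$$

Now applying the binomial formula we obtain that the UMVUE of R is
$$\hat{R} = \begin{cases} \displaystyle\sum_{r=0}^{n_1-2} \sum_{s=0}^{n_2-1} \frac{(-1)^{r+s} \alpha_1 (n_1-1)}{\alpha_1 (r+1) + \alpha_2 s} \binom{n_1-2}{r} \binom{n_2-1}{s} \left(\frac{T_X^{\alpha_2/\alpha_1}}{T_Y}\right)^{s}, & \text{if } T_X^{1/\alpha_1} < T_Y^{1/\alpha_2}, \\[2ex] \displaystyle\sum_{r=0}^{n_1-2} \sum_{s=0}^{n_2-1} \frac{(-1)^{r+s} \alpha_1 (n_1-1)}{\alpha_1 (r+1) + \alpha_2 s} \binom{n_1-2}{r} \binom{n_2-1}{s} \left(\frac{T_Y^{\alpha_1/\alpha_2}}{T_X}\right)^{r+1}, & \text{if } T_X^{1/\alpha_1} > T_Y^{1/\alpha_2}. \end{cases} \tag{3.1}$$

3.1.1 Exponential-Rayleigh model

As a special case of the Weibull model we have a model where X has an exponential and Y has a Rayleigh distribution, with densities
$$f_X(x; \alpha) = \alpha e^{-\alpha x},\ x > 0, \qquad f_Y(y; \beta) = 2\beta^2 y\, e^{-\beta^2 y^2},\ y > 0,$$
where $\alpha$ and $\beta$ are unknown positive parameters. The complete sufficient statistics for $\alpha$ and $\beta$ are $T_X = \sum_{j=1}^{n_1} X_j$ and $T_Y = \sum_{j=1}^{n_2} Y_j^2$.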
The UMVUE of R is
$$\hat{R} = \begin{cases} \displaystyle\sum_{r=0}^{n_1-2} \sum_{s=0}^{n_2-1} \frac{(-1)^{r+s} (n_1-1)}{r + 2s + 1} \binom{n_1-2}{r} \binom{n_2-1}{s} \left(\frac{T_X^2}{T_Y}\right)^{s}, & \text{if } T_X < \sqrt{T_Y}, \\[2ex] \displaystyle\sum_{r=0}^{n_1-2} \sum_{s=0}^{n_2-1} \frac{(-1)^{r+s} (n_1-1)}{r + 2s + 1} \binom{n_1-2}{r} \binom{n_2-1}{s} \left(\frac{\sqrt{T_Y}}{T_X}\right)^{r+1}, & \text{if } T_X > \sqrt{T_Y}. \end{cases}$$

3.1.2 Numerical example

Here we present an example with real data. We wanted to compare daily wind speeds in Rotterdam and Eindhoven. We obtained two samples of 30 randomly chosen daily wind speeds (in 0.1 m/s) from the period of April 1st 2010 to April 1st 2014, taken from the website of the Royal Netherlands Meteorological Institute. The first sample is from Rotterdam and the second one is from Eindhoven:

Rotterdam (X): 48, 15, 27, 18, 40, 26, 84, 19, 35, 32, 55, 29, 45, 51, 47, 66, 38, 13, 39, 28, 50, 36, 15, 74, 53, 85, 18, 58, 18, 48.

Eindhoven (Y): 44, 25, 43, 35, 20, 59, 25, 38, 26, 15, 37, 16, 35, 17, 34, 27, 40, 37, 33, 17, 51, 50, 33, 52, 25, 21, 34, 39, 23, 60.

It is well known that wind speed follows a Weibull distribution. To check this we used the Kolmogorov-Smirnov test. Since this test requires that the parameters are not estimated from the sample being tested, we estimated them beforehand using other, larger samples from the same populations. We found that X follows a Weibull distribution with shape parameter $\alpha_1 = 2.8$ and rate parameter $\sigma_1 = 1/47$ (the Kolmogorov-Smirnov test statistic is 0.157 and the p-value is greater than 0.1), while Y follows a Weibull distribution with shape parameter $\alpha_2 = 2.6$ and rate parameter $\sigma_2 = 1/41$ (the Kolmogorov-Smirnov test statistic is 0.158 and the p-value is greater than 0.1).

Finally, using (3.1) we estimated the probability that the daily wind speed is lower in Rotterdam than in Eindhoven to be $\hat{R} = 0.32$.

3.2 Logarithmic distribution

Let X and Y be independent random variables from logarithmic distributions with mass functions
$$P\{X = x; p\} = \frac{-1}{\ln(1-p)} \frac{p^{x}}{x},\ x = 1, 2, \ldots, \qquad P\{Y = y; q\} = \frac{-1}{\ln(1-q)} \frac{q^{y}}{y},\ y = 1, 2, \ldots,$$
where p and q are unknown probabilities. The logarithmic distribution has applications in biology and ecology.
It is often used for modeling data linked to the number of species.

The complete sufficient statistics for p and q are $T_X = \sum_{j=1}^{n_1} X_j$ and $T_Y = \sum_{j=1}^{n_2} Y_j$. The sum of n independent random variables with logarithmic distributions with the same parameter p has the Stirling distribution of the first kind SDFK(n, p) (Johnson et al., 2005), so $T_X$ has SDFK($n_1$, p) and $T_Y$ has SDFK($n_2$, q) with the following mass functions
$$P\{T_X = x; n_1, p\} = \frac{n_1!\,|s(x, n_1)|\,p^{x}}{x!\,(-\ln(1-p))^{n_1}},\ x = n_1, n_1 + 1, \ldots,$$
$$P\{T_Y = y; n_2, q\} = \frac{n_2!\,|s(y, n_2)|\,q^{y}}{y!\,(-\ln(1-q))^{n_2}},\ y = n_2, n_2 + 1, \ldots,$$
where $s(x, n)$ is the Stirling number of the first kind.

An unbiased estimator for R is $I\{X_1 < Y_1\}$. Since
$$\begin{aligned} E(I\{X_1 < Y_1\} \mid T_X = t_X, T_Y = t_Y) &= \frac{P\{X_1 < Y_1, T_X = t_X, T_Y = t_Y\}}{P\{T_X = t_X, T_Y = t_Y\}} \\ &= \sum_{x=1}^{M} \sum_{y=x+1}^{t_Y-n_2+1} \frac{P\{X_1 = x\} P\{Y_1 = y\} P\{\sum_{k=2}^{n_1} X_k = t_X - x\} P\{\sum_{l=2}^{n_2} Y_l = t_Y - y\}}{P\{T_X = t_X\} P\{T_Y = t_Y\}} \\ &= \sum_{x=1}^{M} \sum_{y=x+1}^{t_Y-n_2+1} \frac{t_X!\,t_Y!\,|s(t_X - x, n_1 - 1)|\,|s(t_Y - y, n_2 - 1)|}{n_1 n_2 (t_X - x)!\,(t_Y - y)!\,x y\,|s(t_X, n_1)|\,|s(t_Y, n_2)|}, \end{aligned}$$
where $M = \min\{t_X - n_1 + 1, t_Y - n_2\}$, using Theorem 1 we get that the UMVUE of R is
$$\hat{R} = \sum_{x=1}^{\min\{T_X-n_1+1,\,T_Y-n_2\}} \sum_{y=x+1}^{T_Y-n_2+1} \frac{T_X!\,T_Y!\,|s(T_X - x, n_1 - 1)|\,|s(T_Y - y, n_2 - 1)|}{n_1 n_2 (T_X - x)!\,(T_Y - y)!\,x y\,|s(T_X, n_1)|\,|s(T_Y, n_2)|}.$$

4 Conclusion

In this paper we considered the unbiased estimation of the probability P{X < Y} when X and Y are two independent random variables. Known UMVUEs of R for several distributions were listed. Two new cases were presented, namely the Weibull model with known but different shape parameters and unknown rate parameters, and the logarithmic model with unknown parameters. An example using real data was provided.

References

[1] Amiri, N., Azimi, R., Yaghmaei, F. and Babanezhad, M. (2013): Estimation of Stress-Strength Parameter for Two-Parameter Weibull Distribution. International Journal of Advanced Statistics and Probability, 1(1), 4-8.

[2] Belyaev, Y. and Lumelskii, Y. (1988): Multidimensional Poisson Walks. Journal of Mathematical Sciences, 40, 162-165.

[3] Birnbaum, Z.W.
(1956): On a Use of Mann-Whitney Statistics. Proc. Third Berkeley Symp. in Math. Statist. Probab., 1, 13-17. Berkeley, CA: University of California Press.

[4] Constantine, K., Carson, M. and Tse, S-K. (1986): Estimators of P(Y < X) in the gamma case. Communications in Statistics - Simulations and Computations, 15, 365-388.

[5] Downton, F. (1973): On the Estimation of Pr(Y < X) in the Normal Case. Technometrics, 15, 551-558.

[6] Hogg, R.V., McKean, J.W. and Craig, A.T. (2005): Introduction to Mathematical Statistics, Sixth Edition. 348-349. New Jersey: Pearson Prentice Hall.

[7] Ivshin, V.V. and Lumelskii, Ya.P. (1995): Statistical Estimation Problems in "Stress-Strength" Models. Perm, Russia: Perm University Press.

[8] Johnson, N.L., Kemp, A.W. and Kotz, S. (2005): Univariate Discrete Distributions. New Jersey: John Wiley & Sons.

[9] Kotz, S., Lumelskii, Y. and Pensky, M. (2003): The Stress-Strength Model and its Generalizations. Singapore: World Scientific Press.

[10] Kundu, D. and Gupta, R.D. (2006): Estimation of P[Y < X] for Weibull Distributions. IEEE Transactions on Reliability, 55(2).

[11] Rezaei, S., Tahmasbi, R. and Mahmoodi, M. (2010): Estimation of P[X < Y] for Generalized Pareto Distribution. Journal of Statistical Planning and Inference, 140, 480-494.

[12] Saracoglu, B., Kaya, M.F. and Abd-Elfattah, A.M. (2009): Comparison of Estimators for Stress-Strength Reliability in the Gompertz Case. Hacettepe Journal of Mathematics and Statistics, 38(3), 339-349.

[13] Tong, H. (1974): A Note on the Estimation of P(Y < X) in the Exponential Case. Technometrics, 16, 625. Errata: Technometrics, 17, 395.

[14] Tong, H. (1977): On the Estimation of P(Y < X) for Exponential Families. IEEE Transactions on Reliability, 26, 54-56.

[15] Website of Royal Netherlands Meteorological Institute, downloaded from http://www.knmi.nl on April 12th 2014.
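As a supplementary illustration of the Weibull result of Section 3.1, the double sum for the UMVUE of R (two samples, known shapes $\alpha_1$ and $\alpha_2$) can be evaluated as sketched below. This is a minimal sketch and not the authors' code. The function name `umvue_weibull` is hypothetical. Because the sum alternates in sign and its terms can be very large for samples of size 30, the sketch assumes that exact rational arithmetic (`fractions.Fraction`) is an acceptable way to avoid floating-point cancellation; the ratio of the sufficient statistics is first computed in floating point and then treated as an exact rational.

```python
from fractions import Fraction
from math import comb


def umvue_weibull(x, y, a1, a2):
    """Evaluate the UMVUE of P{X < Y} for Weibull samples x and y.

    a1 and a2 are the known shape parameters. Requires len(x) >= 2
    and len(y) >= 1, since the sums run over r = 0..n1-2, s = 0..n2-1.
    """
    n1, n2 = len(x), len(y)
    tx = sum(v ** a1 for v in x)   # T_X = sum of X_j ** a1
    ty = sum(v ** a2 for v in y)   # T_Y = sum of Y_j ** a2
    a1f, a2f = Fraction(a1), Fraction(a2)

    first_case = tx ** (1 / a1) < ty ** (1 / a2)
    if first_case:
        ratio = Fraction(tx ** (a2 / a1) / ty)   # raised to the power s
    else:
        ratio = Fraction(ty ** (a1 / a2) / tx)   # raised to the power r + 1

    total = Fraction(0)
    for r in range(n1 - 1):
        for s in range(n2):
            coef = a1f * (n1 - 1) / (a1f * (r + 1) + a2f * s)
            power = ratio ** s if first_case else ratio ** (r + 1)
            sign = -1 if (r + s) % 2 else 1
            total += sign * coef * comb(n1 - 2, r) * comb(n2 - 1, s) * power
    return float(total)
```

For $\alpha_1 = \alpha_2 = 1$ the model is exponential, which gives a hand-checkable case: with `x = [1, 1]` and `y = [1, 3]` the statistics are $T_X = 2$, $T_Y = 4$ and the sum is $1 - \tfrac{1}{2}\cdot\tfrac{1}{2} = 0.75$. Calling the function with the two wind-speed samples of Section 3.1.2 and shapes 2.8 and 2.6 corresponds to the computation described there.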