Metodološki zvezki, Vol. 3, No. 1, 2006, 63-74
Improved Parameter Estimation in Rayleigh
Model
Smail Mahdi1
Abstract
In this paper we describe and present results on the parameter point estimation for the scale and threshold parameters of the Rayleigh distribution. Five estimating methods have been investigated, namely, the maximum likelihood, the method of moment, the probability weighted moments method, the least square method and the least absolute deviation method. Modified maximum likelihood estimators for the parameters are also proposed. Simulation studies have shown that the modified likelihood estimator outperforms the estimators obtained with the other methods except in the case of very small samples.
1    Introduction
The two-parameter Rayleigh distribution is a continuous probability distribution which usually arises when a two dimensional vector has its two orthogonal components normally and independently distributed. The Euclidean norm of the vector will then have a Rayleigh distribution. The distribution may also arise in the case of random complex numbers whose real and imaginary components are normally and independently distributed. The modulus of these numbers will then be Rayleigh distributed. The Rayleigh variable X with threshold parameter e and scale parameter d is   characterized by the cumulative function
(x-e)2
F(x) =1- e-2d2      for e£ x < ¥ and d> 0 .                 (1.1)
This distribution plays an important in real life applications since it relates to a large number of distributions such as generalized extreme value, Weibull, chi-square and rice distributions. In this paper we investigate the estimation of the scale and threshold parameters using a modified maximum likelihood method (ML),  the moment  method  (MM),  the  probability weighted  moment  method
1 Department of Computer Science, Mathematics and Physics, University of the West Indies, Cave Hill Campus, Barbados.
64
Smail Mahdi
(PWM), the ordinary least square method (OLS) and the least absolute deviation (LAD) method. The PWM method is a relatively recent technique which is well used by hydrologists in frequency analysis. The method is strongly advocated in Hosking et al. (1985) and according to Davison and Smith (1990) it constitutes the most serious competitor to the ML method, especially, in the case of small samples. The performance of this technique has been recently investigated in Mahdi and Ashkar (2004) and Ashkar and Mahdi (2003) in Weibull and Log-logistic models, respectively. We organise this paper as follows. In Section 1, we have introduced the problem and in Section 2, we derive the parameter estimators using the considered methods and also give results on the asymptotic variances. Simulations results are discussed and illustrated in Section 3.
2    Estimation methods
We derive and present below estimators for the parameters e and d by using the five considered methods. We start with the probability weighted moments method.
2.1     Probability weighted moments
The probability weighted moment of order (i,j,k) is obtained from the inverse cumulative function   x(F) = e + dJ-2ln(1-F) as
Mij ,k =E (Xi [ F ( x )] [ 1-F ( x )] k) =  1 ( x ( F )) Fj ( -F ) k dF .             (2.1)
We use the usual orders i=1 and j=0 since this leads to a class of linear L-moments, see, Hosking (1986; 1990), with asymptotic normality. We denote ar the corresponding probability weighted momentM10 . After integration and simplification, we obtain
a =
e + dl    p
2(r +1)
/(r + 1) .                                         (2.2)
Substituting   two   distinct   orders   r   and   s   into   equation   (2.2)   gives   the probability weighted moment estimate for d as
Ù =   (r + 1)a ˆr-(s + 1)a ˆs                                                 (23)
I     p       I     p
"\|2(r + 1) -"\| 2(s + 1)
Improved Parameter Estimation in Rayleigh Model
65
where
aˆ
C
n-i x ri
= n-1? Cr n-x1i   for   r = 0,1,K,n -1, with the convention Ckj = 0 if k< j ,
i=1
C
is the unbiased estimators for ar , see Hosking (1989).  Thus, the estimator for   e is given by
e = (r + 1)ar-d
p     .
2(r +1)
(2.4)
2.1.1 Asymptotic variances of ˆ and d ˆ
The asymptotic variances of e ˆand d ˆare approximated by using the asymptotic variances of the PWM estimates ar . We will use result (5.3), provided in Hosking
(1986), stating that the vector whose rth component is jn(ar - ar), forr =0,1,...,m-1, has a Gaussian limiting distribution with mean vector 0 and covariance matrix A = ( Ars ) m-=10 where Ars =Irs+Isr and
I
(     ())r (     ())    ()(     ())
rs =        1-F x      1-F y   s F x 1-F x   dxdy.
??x<y
(2.5)
Using the approximation 26.2.10 in Abramowitz and Stegun (1970: 932), we get after integration and simplification,
Irs =2 2 J------( —
V s + 1\ 2
V 4r   \
p
4(r + 1)
21 +?¥ [n(2n+1)(n-1)]-1(-1)n
r+1                          2n+3              2n+1
n=1
2 2 (s + 1)
11 -
n+1         n+1
r
(r + 1)
(2.6)
The first order approximations for the variances and covariance of e ˆand d ˆ are obtained from the equation
Cov(aˆ,sˆ) = n-1GAGT
where the terms of the 2 by 2 matrix G, derived from the probability weighted moment equations of the form (2.3) and (2.4), are given by
G1 1 =-----= r +1,
(2.7)
66
Smail Mahdi
¶a
G12== 0,                                                                   (2.8)
and
G21= ¶--- =      |               |           ,                                       (2.9)
ar           p             p
A/ 2(r +1) - A/ 2(s + 1)
G 22= ¶d =-,     s+1,                                                 (210)
¶a                 p                 p
2(r + 1)    "V 2(s +1)
2.2     Maximum likelihood
Setting to zero the first derivative of the log-likelihood function with respect to  d gives the ML estimate for d for a given value e as
(2.11) The maximum likelihood estimate for the parameter e which is on    boundary
Ù
of the   distribution   support  is  given  by   e = min ( x1,x2,K,x ) . Note  that  this
estimator is biased since e is distributed as a Rayleigh variable with threshold
parameter  e   and scale parameter dV.     To prove that, let us denote G the
/ Vn
Ù
cumulative   function   of   e    which   is   based   on   the   distribution   random
Ù
sampleX1,K,X    The function G, evaluated ate = y, is given by
G(y) = P(min(X1, K,Xn)£y) = 1-P(min(X1,K,Xn)>y) = 1-{(P(X1fy)´K´P(Xnfy)} = 1-[1-F(y)] n.
Substituting now F(y) from equation (1.1), we get
(y-e 2
G(y)=1-e   2(d /   n )
Improved Parameter Estimation in Rayleigh Model                                              67
which has the same form as F(y). Therefore the Rayleigh variable  e  has the meane + —j=J;r2 . We propose then to use the modified maximum likelihood
estimators ~ and ~ that are solutions of the systems of equations

(2.12)
=
? (xi -e)2 ~
i =1                                                                                            (2.13)
2n
and which are based on the unbiased estimator ~ . Squaring equation (2.13) and expressing e as function of e from equation (2.12) yield the second order equation of ~ ,
(2------) ~2 +2J—  (e ˆ - x) ~ + 2xe - x 2 - e ˆ 2 = 0.                            (2.14)
2n             v 2n
The discriminant of the above equation is positive and is given by
D = 2(e-x)2 + (x2 -x2)(2------)³0.                             (2.15)
2n
Therefore,   equation   (2.14)   has   two   distinct   solutions.   Furthermore   this equation    admits    a   unique    positive    solution    since   the    roots    product   is
(e-x)2 +(x2 -x2) ----------------------------- 0.      Thus  the  modified   estimators   for  d   and   e   are
p         p
2-2n
respectively given by
(x -x(1))J— +J2(x(1) -x)2 + (x2 -x2)(2------)
2------
V 2n      v                                                2n --------------------------------------------------f 0                    (216)
2n
and
68
Smail Mahdi
x(1) -
p
n/2
(2.17)
where x and x2 are the first and second sample moments, respectively and x(1) is the first order statistic based on the random sample x1,K, xn .
2.2.1 Variance of dÙ ande~.
~ The asymptotic variance of dis approximately given by
Var(d ˆ) =
 ¶2ln(f)
-1
=
d2
n
4 + 3 e +       p
2
d2
d
(2.18)
obtained from the sample Fisher information on d. On the other hand, we have that Var ~ e) = Var(e) = (2-^2 ) d n by using the distribution ofe ˆ. Thus, ~ is a consistent estimator ofe.
2.3     Method of moments
The moment about the origin of order r is given by,
(x-e)   -(x-e)2 e      d2
mr = E(X r ) = ¥? xr  x -e e   2d2 dx .
After integration we get
v2
mr
V2
(2d ) k G[(k + 1)/2]
2d k=0 This can be simplified as
--------V Crk er k(l2d ) k G [( k + 1 ) /2 ] .
mr =YCkr - 1 er+1-k(j2d ) k -1 G[(k+1)/2].
k=1
(2.19)
(2.20)
(2.21)
Improved Parameter Estimation in Rayleigh Model
69
The first and second order non central moments can be evaluated from either equation (2.20) or (2.21). Using for instance equation (2.21), we get
m 1=2 e+2 G(3/2)d -e = e+dp —
and
m2 =e2 + 2V2 G(3/2) ed + 2d2 = e2 + 42p ed+ 2d2. This yields the following moment method estimators for d ande,
(2.22)
(2.23)
d =
nYß- Žxi
2 n 1- p
p
and
e=x-d—
(2.24)
(2.25)
where s
22
= yx  -x   i
s an estimator of the population standard deviation.
2.3.1 Asymptotic variances and covariance of ˆ and d ˆ
The asymptotic variances and covariance of e ˆ and d ˆ are estimated from the variances and covariance of the sample general moments m ˆr and m ˆs, see for instance, Mahdi and Ashkar (2004), as follows:
Var(e ˆ)		 M121	M2 12	2M11M12	-1
Var(d ˆ)	=	M 2	M2 22	2M 21 M 22	
Cov(e ˆ,d ˆ)		M 11 M 21	MM 12        22	M 11 M 22 +M 21 M 12_	
where
Var(m ˆr) =
Var(m ˆr)
Var(m ˆl ) Cov( m ˆ r , m ˆ l)
m 2r-( mr)2                    m 2 l-( ml)2 ;Cov(m ˆ r , m ˆ l)= m r + l-mrml
(2.26)
n
 Var(m ˆ l) =
n
n
M=   ¶     ;M2=   ¶d ;   M21
11       e      1                               ¶e
¶m l andM      ¶m l
22
d¶
In the considered case r = 1 and l = 2, we have
70
Smail Mahdi
Var(m ˆ2)
Var(m ˆ1)= (^rd , 2n
2(4 - p)d2e2 + 242p d3e + 4d4 n
=
,
and finally:
d 2 [(4 - p)e + d—] Cov( m ˆ1 ,m ˆ2) =-------------------------,
n
M11 = 1,
V 2 M21=2e+d 2 p
M22 = 4d + e2p .
(2.27) (2.28)
(2.29) (2.30)
(2.31)
(2.32)
(2.33)
2.4     Regression methods
The  parameters   dande   can  also be  estimated through the linear regression
technique   from   the   relationx = e + d]-2ln(1-F(x).   Ordinary   least   square
estimates as well as least absolute deviation   estimates for dande are
obtained   from   the   sample   points   {xi,yi}    where   xiis  the   ith   sample  value
corresponding to the empirical quantileF ˆ(x(i)) and yi =^-2ln(1-F(xi) . Ordinary
least square estimates for dande are obtained from the usual intercept and slope linear regression estimates, see, for instance Rice (1995). The least absolute deviation or median regression estimates of dande are obtained as solution to the
n
minimization problem:         Min = |yi-a-bxi|   with   respect  to   a   and   b.   The
solution is obtained by applying the simplex method to the linear programming problem:
-
Min?(ri+ + ri- ) under    the    constraints     yi - a -bxi - ri+ + ri- = 0 where     ri+     and
i=1
ri- are, respectively, the positive and negative residuals associated with the
observation xi ,i = 1,...,n.
Improved Parameter Estimation in Rayleigh Model                                              71
2.4.1 Variances of OLS estimators
The computation of the variances of the least absolute deviation estimators is extremely tedious. However, we can find the variances of the ordinary least square estimators under the assumptions of the standard statistical models, see, for instance Rice (1995). Let us denote dˆOls and eˆOls the OLS estimators of dande, respectively. The variances of these estimators are, respectively, given by
Var(d ˆ Ols) = -
ns
i=1
and
2nY,ln(1 - F(xi))+   //x/-2ln(1-F(xi)))
i=1
(2.34)
n
2s2 'S] ln(1 - F(xi)) Var( e ˆ Ols) =iii
2nS]ln(1-F(xi)) +  y^-J-2ln(1-F ˆ(xi)))
(2.35)
where
s 2 =Var(X) =dd 2                                      (236)
2
is obtained from equations (2.22) and (2.23).
3    Discussion
We have assessed the performance of the considered estimation methods through simulation studies. Different values of the parameters have been considered as well as different sample sizes. Orders (1,2), (1,3) and (2,3) are used in the PWM method. The sample points were generated using the inverse cumulative function technique. The probability weighted moments are estimated with the plotting method outlined in Hosking (1986), that is, M1,u,v is estimated by
M 1,u,v = n(-1) ? x(i) pi u (1- pi )v ˆ
i
where
2
2
72
Smail Mahdi
pi =
i+g n+d'
for d'>g>-1
We used the values g=- 0.35 and d'=0 which are recommended in Hosking (1986) for the study of the generalized extreme value distribution since the Rayleigh is well related to it. Several values ford,e and n were considered, namely,d= 2,4, 6, 8, 10 ; n = 10,20,30,40,50,60,70,80,90,100, and e=1, 3, 5, 7, 9 and 11. Small sample sizes from 1 to 9 were also considered and obtained corresponding results are displayed in Table 3. The root mean square errors (RMSE) for the estimates were then computed and used as performance index. Note that expressions for the asymptotic variances are also obtained whenever it is possible. These variances may be used, for instance, to compute approximate confidence bounds for the underlying parameters. First we have found that the variation of the evalue does not affect the RMSE results. One can then set, without loss of generality,e=1. However, the root mean square errors obtained with all methods increase as the value d increases, as illustrated in Table 1 below. On the other hand, the study has shown that the PWM orders 1 and 2 provide better RMSE results.
Table 1: RMSE of dande estimates obtained with the different methods combined by averaging the sample sizes n= 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 for various values of d. e=1 and PWM orders 1 and 2 are used.
d	2	4	6	8	10
dˆPWM	.2 3	.4 5	.68	.90	1.1 3
eˆPWM	.2 5	.5 0	.75	.99	1.2 5
dˆML	.2 0	.4 0	.61	.82	1.0 3
eˆML	.2 3	.4 7	.71	.95	1.1 9
dˆmm	.2 3	.4 8	.72	.95	1.2 0
eˆMM	.4 4	.8 8	1.3 1	1.7 5	2.2 0
dˆOLS	.2 3	.4 6	.69	.92	1.1 6
eˆOLS	.2 8	.5 6	.84	1.1 2	1.4 1
dˆLAD	.2 3	.4 6	.70	.93	1.1 7
eˆLAD	.2 8	.5 6	.85	1.1 4	1.4 2
Our investigation has also shown that the method of moments performs poorly in comparison to the other methods. Table 2, displayed below, gives the root mean
Improved Parameter Estimation in Rayleigh Model
73
square errors as functions of the sample size n. It shows that the root mean square values are monotonically decreasing with the sample size n. Overall, all methods have performed reasonably well except the method of moments. However, the modified maximum likelihood method provides better estimates ford, with any sample size, and for bothd andeparameters when the sample sizes are not small, say n=10, as illustrated in Tables 2 and 3. Note that in the case of small samples, the PWM method outperforms the maximum likelihood method for the estimation of e and performs almost as good as the maximum likelihood method for the estimation ofd, see, Table 3 results. Consequently, we recommend using the modified maximum likelihood method for the parameter estimation of the Rayleigh distribution in the case of non small samples. However, we notice that there is a gain in using the PWM method for estimating the Rayleigh threshold parameter when the sample size is small; this confirms Davison and Smith (1990) statement.
Table 2: RMSE of dande estimates, obtained with the different methods, combined by averaging over the values d = 2, 4, 6, 8 and 10 for various sample sizes n. e=1 and PWM orders 1 and 2 are used.
n	10	20	30	40	50	60	70	80	90	100
dˆPWM	1.34	.96	.78	.68	.61	.57	.52	.48	.4 6	.43
eˆPWM	1.45	1.05	.86	.75	.67	.61	.57	.53	.5 0	.46
dˆML	1.25	.87	.71	.61	.55	.49	.46	.43	.4 1	.39
eˆML	1.46	1.03	.82	.70	.62	.56	.52	.48	.4 5	.43
dˆmm	1.45	1.01	.82	.71	.64	.58	.54	.50	.4 7	.45
eˆMM	2.96	1.93	1.52	1.2 8	1.13	1.02	.93	.86	.8 1	.76
dˆOLS	1.37	.98	.80	.70	.63	.57	.53	.50	.4 7	.44
eˆOLS	1.70	1.19	.97	.84	.75	.69	.64	.60	.5 6	.53
dˆLAD	1.40	.99	.80	.70	.62	.57	.53	.49	.4 7	.44
eˆLAD	1.77	1.21	.98	.84	.75	.68	.63	.59	.5 6	.53
Acknowledgment
The financial support of UWI is gratefully acknowledged. The author is grateful to the editors and referees for their valuable help and suggestions that benefited and improved this article.
74
Smail Mahdi
Table 3: RMSE of dande estimates, obtained with the different methods, combined by
averaging over the values d = 2, 4, 6, 8 and 10 for small values      n=5, 6, 7, 8 and 9.
e=1 and PWM orders 1 and 2 are used.
6	7	8	9
1.7 0	1.5 8	1.4 9	1.4 1
1.8 2	1.7 0	1.6 1	1.5 3
1.6 6	1.5 2	1.4 2	1.3 3
2.1 4	1.9 4	1.7 9	1.6 6
1.9 0	1.7 5	1.6 3	1.5 3
4.1 3	3.7 3	3.4 2	3.1 7
1.7 7	1.6 3	1.5 4	1.4 4
2.2 4	2.0 6	1.9 1	1.8 0
1.8 3	1.6 7	1.5 8	1.4 8
2.3 6	2.1 2	2.0 3	1.8 9
Note: The numerical studies have been carried out with Gauss and SPSS, Release 11.
References
[1] Abramowitz, M. and Stegun, I. (1970): Handbook of Mathematical Functions. New York: Dover Publications, Inc.
[2] Ashkar, F. and Mahdi, S. (2003): Comparison of two fitting methods for the log-logistic distribution. Water Resources Research, 39, no. 8.
[3] Davison, A.C. and Smith, R.L. (1990): Models for exceedances over high threshold. J. R. Stat. Soc. B, 52 (3), 393-442.
[4] Hosking, J.R.M. (1986): The Theory of Probability Weighted Moments. New York. Research Report RC12210, IBM Thomas J. Watson Research Center.
[5] Hosking, J.R.M. (1990): L-Moments: analysis and estimation of distributions using linear combinations of order statistics. J. R. Stat. Soc. B, 52(1), 105-124.
[6] Mahdi, S. and Ashkar, F. (2004): Exploring Generalized Probability Weighted Moments, Generalized Moments and Maximum Likelihood Estimating Methods in Two-Parameter Weibull Model. Journ. of Hydrology, 285, 62-77.
[7] Rice, J.A. (1995): Mathematical Statistics and Data Analysis, 2nd Ed. Duxbury Press.
n	5
dˆPWM	1.8 5
eˆPWM	1.9 6
dˆML	1.8 4
eˆML	2.4 0
dˆmm	2.1 0
eˆMM	4.6 7
dˆOLS	1.9 4
eˆOLS	2.4 9
dˆLAD	2.0 3
eˆLAD	2.7 4