Metodoloski zvezki, Vol. 11, No. 1, 2014, 65-78
A Comparison of Methods for the Estimation of Weibull Distribution Parameters
Felix Noyanim Nwobi1 and Chukwudi Anderson Ugomma2
Abstract
In this paper we study the different methods for estimation of the parameters of the Weibull distribution. These methods are compared in terms of their fits using the mean square error (MSE) and the Kolmogorov-Smirnov (KS) criteria to select the best method. Goodness-of-fit tests show that the Weibull distribution is a good fit to the squared returns series of weekly stock prices of Cornerstone Insurance PLC. Results show that the mean rank (MR) is the best method among the methods in the graphical and analytical procedures. Numerical simulation studies carried out show that the maximum likelihood estimation method (MLE) significantly outperformed other methods.
1 Introduction
The Weibull distribution has been widely studied since its introduction in 1951 by Professor Waloddi Weibull (Weibull, 1951). These studies range from parameter estimation (see, for example, Mann et al. (1974), Johnson et al. (1994) and Al-Fawzan (2000)) to diverse applications in reliability engineering, notably Tang (2004), and lifetime analysis in Lawless (1982, 2003). The popularity of the distribution is attributable to the fact that it provides a useful description for many different kinds of data, especially in emerging areas such as wind speed and finance (stock prices and actuarial data), in addition to its traditional engineering applications.
1 Department of Statistics, Imo State University, Owerri 460222, Nigeria. Email: fnnwobi@imsu.edu.ng (corresponding author).
2 Department of Statistics, Imo State University, Owerri 460222, Nigeria. Email: ugochukwu4all@yahoo.com
Before desktop computers and reliability analysis software became available, engineers and statisticians relied mainly on probability plots, referred to as the graphical procedure, to analyze life data. In Section 2 we discuss three graphical methods: the mean rank (MR), the median rank (MDR) and the symmetric cumulative distribution function (SCDF). Also in Section 2 we review three methods in the objective analytical procedure: the maximum likelihood estimation (MLE), the method of moments (MOM) and the least squares method (LSM). These methods are compared in Section 3 using the mean square error (MSE) and the Kolmogorov-Smirnov (KS) criteria.
2 Methods for parameter estimation
Let $s_1, s_2, \ldots, s_N$ be a random sample of size $N$ from a population. Define $r_t = \ln(s_t / s_{t-1})$, $r_t \in (-\infty, \infty)$, as the returns of the stock prices (say), $\{s_t : s_t > 0\}$. Let $x_t = r_t^2 \in \mathbb{R}^+$ be hereinafter referred to as the squared returns.
2.1 The Weibull distribution
The general form of a three-parameter Weibull probability density function (pdf) is given by
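$$ f(x_t) = \frac{\beta}{\alpha}\left(\frac{x_t - \upsilon}{\alpha}\right)^{\beta - 1} \exp\left\{-\left(\frac{x_t - \upsilon}{\alpha}\right)^{\beta}\right\}, \qquad x_t \ge \upsilon \ge 0;\ \alpha, \beta > 0 \tag{2.1} $$
where $x_t$ is the data vector at time $t$; $\beta$ is the shape parameter; $\alpha$ is the scale parameter that indicates the spread of the distribution of sampled data and $\upsilon$ is the location parameter. With $\upsilon = 0$, the Weibull probability density function satisfies the following properties: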
a) If $0 < \beta < 1$, $f$ is decreasing with $f(x) \to \infty$ as $x \to 0^+$.
b) If $\beta = 1$, $f$ is decreasing with $f(x) \to 1/\alpha$ as $x \to 0^+$.
c) If $\beta > 1$, $f$ at first increases and then decreases, with a maximum value at the mode $x = \alpha(1 - 1/\beta)^{1/\beta}$.
d) For all $\beta > 0$, $f(x) \to 0$ as $x \to \infty$.
The cumulative distribution function (cdf) of the Weibull distribution is mathematically given as:
$$ F(x) = 1 - \exp\left\{-\left(\frac{x - \upsilon}{\alpha}\right)^{\beta}\right\}. \tag{2.2} $$
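In the case $\upsilon = 0$, the pdf in (2.1) reduces to
$$ f(x) = \begin{cases} \dfrac{\beta}{\alpha}\left(\dfrac{x}{\alpha}\right)^{\beta - 1} \exp\left\{-\left(\dfrac{x}{\alpha}\right)^{\beta}\right\}, & x > 0;\ \alpha, \beta > 0 \\ 0, & \text{otherwise} \end{cases} \tag{2.3} $$
with a corresponding cdf
$$ F(x) = \begin{cases} 1 - \exp\left\{-\left(\dfrac{x}{\alpha}\right)^{\beta}\right\}, & x > 0 \\ 0, & \text{otherwise.} \end{cases} \tag{2.4} $$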
Cheng and Chen (1988) observed that the distribution interpolates between the exponential distribution ($\beta = 1$) and the Rayleigh distribution ($\beta = 2$). The mean and variance of the Weibull distribution are $E(X) = \alpha\,\Gamma(1 + 1/\beta)$ and $V(X) = \alpha^2\left[\Gamma(1 + 2/\beta) - \Gamma^2(1 + 1/\beta)\right]$ respectively, where $\Gamma(n)$ is the gamma function evaluated at $n$.
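The two-parameter density, distribution function and moments of this subsection can be evaluated directly. The following is a minimal Python sketch, not the authors' code (the paper's computations were done in R); the function names are illustrative assumptions, `alpha` denotes the scale and `beta` the shape.

```python
import math


def weibull_pdf(x, alpha, beta):
    """Two-parameter Weibull density as in (2.3); zero for x <= 0."""
    if x <= 0:
        return 0.0
    z = x / alpha
    return (beta / alpha) * z ** (beta - 1) * math.exp(-(z ** beta))


def weibull_cdf(x, alpha, beta):
    """Two-parameter Weibull distribution function as in (2.4)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / alpha) ** beta))


def weibull_mean(alpha, beta):
    """E(X) = alpha * Gamma(1 + 1/beta)."""
    return alpha * math.gamma(1.0 + 1.0 / beta)


def weibull_var(alpha, beta):
    """V(X) = alpha^2 * [Gamma(1 + 2/beta) - Gamma(1 + 1/beta)^2]."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    return alpha ** 2 * (g2 - g1 ** 2)
```

With `beta = 1` the functions reduce to the exponential case, which gives a quick sanity check: the mean equals `alpha` and the variance equals `alpha ** 2`.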
2.2 Estimation procedures
2.2.1 Graphical procedure
If both sides of the cdf in (2.4) are transformed by $\ln(1/(1 - x))$, we get
$$ \ln\left[\frac{1}{1 - F(x_i)}\right] = \left(\frac{x_i}{\alpha}\right)^{\beta} $$
so that
$$ \ln \ln\left[\frac{1}{1 - F(x_i)}\right] = \beta \ln x_i - \beta \ln \alpha. \tag{2.5} $$
Here, $x_i$ actually represents the order statistics $x_{(1)} < x_{(2)} < \ldots < x_{(n)}$. If we let $Y = \ln\ln\left(1/(1 - F(x_i))\right)$, $X = \ln x_i$ and $c = -\beta \ln \alpha$, then (2.5) represents a simple linear regression function corresponding to
$$ Y = \beta X + c. \tag{2.6} $$
The estimate of $\alpha$, the scale parameter, is calculated as
$$ \hat{\alpha} = \exp\left(-\frac{c}{\hat{\beta}}\right) \tag{2.7} $$
where $c$ is the intercept of the linear regression (2.6).
Thus, we perform the estimation of $\alpha$ and $\beta$ using the methods of estimation in Table 1.
Table 1: Methods of estimation by graphical procedure

| Method | $\hat{F}(x_i)$ |
|---|---|
| Mean Rank | $i/(n + 1)$ |
| Median Rank | $(i - 0.3)/(n + 0.4)$ |
| Symmetric CDF | $(i - 0.5)/n$ |
We plot $Y_i$, which is a function of $F(x_i)$, versus $X_i$ ($= \ln x_i$), using the following procedure:
a) Rank the data $\{x_i\}$ in ascending order of magnitude;
b) Estimate $F(x_i)$ of the $i$th rank order; and
c) Plot $Y_i$ versus $X_i$.
This plot produces a straight line from which we obtain $\hat{\beta}$ and $\hat{\alpha}$ (see (2.6) and (2.7)).
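The ranking, plotting-position and line-fitting steps can be sketched in a few lines. This is an illustrative Python sketch rather than the authors' R code; the function name `graphical_fit` and the dictionary `PLOTTING_POSITIONS` are assumptions introduced here, and the straight line is fitted by ordinary least squares on the transformed pairs, which is one way of reading the slope and intercept off the plot.

```python
import math

# Plotting positions from Table 1, as functions of the rank i and sample size n.
PLOTTING_POSITIONS = {
    "MR": lambda i, n: i / (n + 1),
    "MDR": lambda i, n: (i - 0.3) / (n + 0.4),
    "SCDF": lambda i, n: (i - 0.5) / n,
}


def graphical_fit(data, method="MR"):
    """Return (alpha_hat, beta_hat) from the linearised cdf, eqs. (2.5)-(2.7).

    Assumes all observations are strictly positive so that ln(x) exists.
    """
    xs = sorted(data)                      # step a): rank the data
    n = len(xs)
    position = PLOTTING_POSITIONS[method]  # step b): estimate F at rank i
    X = [math.log(x) for x in xs]
    Y = [math.log(math.log(1.0 / (1.0 - position(i, n)))) for i in range(1, n + 1)]

    # Step c): fit Y = beta * X + c by least squares.
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    sxx = sum((x - x_bar) ** 2 for x in X)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    beta_hat = sxy / sxx
    c_hat = y_bar - beta_hat * x_bar
    alpha_hat = math.exp(-c_hat / beta_hat)  # eq. (2.7)
    return alpha_hat, beta_hat
```

If the data lie exactly on the theoretical quantiles for a given plotting position, the fit recovers the generating parameters, which is a convenient check of the transformation.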
2.2.2 Analytical procedure
Maximum Likelihood Estimation (MLE)
The method of maximum likelihood estimation is a commonly used procedure for estimating parameters; see, e.g., Cohen (1965) and Harter and Moore (1965). Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ drawn from a population with probability density function $f(x, \lambda)$, where $\lambda = (\beta, \alpha)$ is an unknown vector of parameters, so that the likelihood function is defined by
$$ L(\alpha, \beta) = \prod_{i=1}^{n} f(x_i, \lambda). \tag{2.8} $$
The maximum likelihood estimator of $\lambda = (\beta, \alpha)$ maximizes $L$ or, equivalently, the logarithm of $L$, which occurs when
$$ \frac{\partial \ln L}{\partial \lambda} = 0, \tag{2.9} $$
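see, for example, Mood et al. (1974). Consider the Weibull pdf given in equation (2.3); its likelihood function is given as:
$$ L(x_1, \ldots, x_n; \alpha, \beta) = \prod_{t=1}^{n} \frac{\beta}{\alpha}\left(\frac{x_t}{\alpha}\right)^{\beta - 1} \exp\left\{-\left(\frac{x_t}{\alpha}\right)^{\beta}\right\}. \tag{2.10} $$
Taking the natural logarithm of both sides yields
$$ \ln L = n \ln\left(\frac{\beta}{\alpha}\right) + (\beta - 1)\sum_{t=1}^{n} \ln x_t - n \ln\left(\alpha^{\beta - 1}\right) - \sum_{t=1}^{n}\left(\frac{x_t}{\alpha}\right)^{\beta} \tag{2.11} $$
and differentiating (2.11) partially w.r.t. $\beta$ and $\alpha$ in turn and equating to zero, we obtain the estimating equations as follows
$$ \frac{\partial}{\partial \beta} \ln L = \frac{n}{\beta} + \sum_{t=1}^{n} \ln x_t - n \ln \alpha - \sum_{t=1}^{n}\left(\frac{x_t}{\alpha}\right)^{\beta} \ln\left(\frac{x_t}{\alpha}\right) = 0 \tag{2.12} $$
and
$$ \frac{\partial}{\partial \alpha} \ln L = -\frac{n\beta}{\alpha} + \frac{\beta}{\alpha}\sum_{t=1}^{n}\left(\frac{x_t}{\alpha}\right)^{\beta} = 0. \tag{2.13} $$
From (2.13) we obtain an estimator of $\alpha$ as
$$ \hat{\alpha}_{mle} = \left(\frac{1}{n}\sum_{t=1}^{n} x_t^{\hat{\beta}}\right)^{1/\hat{\beta}} \tag{2.14} $$
and on substitution of (2.14) in (2.12) we obtain
$$ \frac{1}{\beta} + \frac{1}{n}\sum_{t=1}^{n} \ln x_t - \frac{\sum_{t=1}^{n} x_t^{\beta} \ln x_t}{\sum_{t=1}^{n} x_t^{\beta}} = 0 \tag{2.15} $$
which may be solved to obtain the estimate of $\beta$ using the Newton-Raphson method or any other numerical procedure, because (2.15) does not have a closed form solution. When $\hat{\beta}_{mle}$ is obtained, the value of $\hat{\alpha}$ follows from (2.14).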
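Equation (2.15) involves only the shape parameter, so it can be solved by any one-dimensional root finder. The sketch below is a Python illustration, not the authors' implementation (the paper relied on the R functions fitdist() and fitdistr()). It uses bisection instead of Newton-Raphson because the left-hand side of (2.15) is decreasing in the shape parameter, which makes bracketing straightforward. The names `score_beta` and `weibull_mle` are assumptions, the data are assumed strictly positive with at least two distinct values, and the scale is recovered under the parameterization of (2.3), i.e. as the mean of the observations raised to the estimated shape, taken to the power one over that shape.

```python
import math
import random


def score_beta(beta, logs):
    """Left-hand side of (2.15) as a function of beta, given ln(x_t)."""
    n = len(logs)
    l_max = max(logs)
    # x_t^beta rescaled by its largest value to avoid overflow; the ratio
    # of sums in (2.15) is unaffected by this common factor.
    w = [math.exp(beta * (l - l_max)) for l in logs]
    s0 = sum(w)
    s1 = sum(wi * li for wi, li in zip(w, logs))
    return 1.0 / beta + sum(logs) / n - s1 / s0


def weibull_mle(data, tol=1e-12):
    """Return (alpha_hat, beta_hat) by solving (2.15) with bisection."""
    logs = [math.log(x) for x in data]
    lo, hi = 1e-6, 1.0
    while score_beta(hi, logs) > 0:      # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if score_beta(mid, logs) > 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    l_max = max(logs)
    mean_w = sum(math.exp(beta * (l - l_max)) for l in logs) / len(logs)
    alpha = math.exp(l_max) * mean_w ** (1.0 / beta)
    return alpha, beta


if __name__ == "__main__":
    rng = random.Random(1)
    # random.weibullvariate takes (scale, shape) in that order.
    sample = [rng.weibullvariate(0.45, 0.54) for _ in range(1000)]
    print(weibull_mle(sample))
```

Newton-Raphson would typically need fewer iterations, but it requires the derivative of (2.15) and a safeguard to keep the iterate positive; bisection trades speed for simplicity here.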
Method of Moments (MOM)
The second procedure we consider here is the MOM, which is also commonly used in parameter estimation. Let $x_1, x_2, \ldots, x_n$ represent a set of data for which we seek an unbiased estimator for the $k$th moment. Such an estimator is generally given by
$$ \hat{m}_k = \frac{1}{n}\sum_{t=1}^{n} x_t^{k} \tag{2.16} $$
where $\hat{m}_k$ is the estimate of the $k$th moment. For the Weibull distribution given in (2.3), the $k$th moment is given by
$$ \mu_k = \alpha^{k}\,\Gamma\left(1 + \frac{k}{\beta}\right) \tag{2.17} $$
where $\Gamma$ is as defined in subsection 2.1. From (2.17), we can find the 1st and 2nd moments about zero as follows
$$ \hat{m}_1 = \hat{\mu} = \alpha\,\Gamma\left(1 + \frac{1}{\beta}\right) \tag{2.18} $$
and
$$ \hat{m}_2 = \hat{\mu}^2 + \hat{\sigma}^2 = \alpha^{2}\,\Gamma\left(1 + \frac{2}{\beta}\right). \tag{2.19} $$
When we divide the square of $\hat{m}_1$ by $\hat{m}_2$, we get an expression which is a function of only $\beta$,
$$ \frac{\hat{\mu}^2}{\hat{\mu}^2 + \hat{\sigma}^2} = \frac{\Gamma\left(1 + \frac{1}{\beta}\right)\Gamma\left(1 + \frac{1}{\beta}\right)}{\Gamma\left(1 + \frac{2}{\beta}\right)} \tag{2.20} $$
where $\hat{\mu} = E(X_t) = \frac{1}{n}\sum_{t=1}^{n} x_t$, $\hat{\sigma}^2 = E(X_t^2) - (E(X_t))^2$, and letting $Z = 1/\beta$, (2.20) is easily transformed in order to estimate $\beta$, so that the scale parameter $\alpha_{mom}$ can be estimated with the following relation
$$ \hat{\alpha}_{mom} = \frac{\hat{\mu}}{\Gamma\left(1 + 1/\hat{\beta}\right)}. \tag{2.21} $$
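The moment-matching step reduces to a one-dimensional root search: the ratio of the squared first moment to the second moment equals the squared gamma function at one plus the reciprocal of the shape, divided by the gamma function at one plus twice that reciprocal, and this ratio increases with the shape. The following Python sketch is an assumption-laden illustration, not the authors' R code; the names `log_ratio`, `mom_from_moments` and `weibull_mom` are hypothetical, and bisection on the logarithm of the ratio is one of several possible numerical procedures.

```python
import math


def log_ratio(beta):
    """ln of Gamma(1 + 1/beta)^2 / Gamma(1 + 2/beta), cf. (2.20)."""
    return 2.0 * math.lgamma(1.0 + 1.0 / beta) - math.lgamma(1.0 + 2.0 / beta)


def mom_from_moments(m1, m2, tol=1e-12):
    """Return (alpha_hat, beta_hat) from the first two raw moments."""
    if not 0.0 < m1 * m1 < m2:
        raise ValueError("need 0 < m1^2 < m2 (non-degenerate positive data)")
    target = math.log(m1 * m1 / m2)
    lo, hi = 1e-3, 1.0
    while log_ratio(hi) < target:        # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if log_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = m1 * math.exp(-math.lgamma(1.0 + 1.0 / beta))  # eq. (2.21)
    return alpha, beta


def weibull_mom(data):
    """Method-of-moments fit using the sample moments of (2.16)."""
    n = len(data)
    m1 = sum(data) / n
    m2 = sum(x * x for x in data) / n
    return mom_from_moments(m1, m2)
```

Feeding the exact theoretical moments of a known Weibull distribution into `mom_from_moments` should return the generating parameters, which is a simple way to validate the solver before applying it to data.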
The Least Squares Method (LSM)
The least squares method is commonly applied in engineering and mathematics problems that are often not thought of as estimation problems. We assume that there is a linear relationship between two variables. Assume that a dataset of pairs $(x_t, y_t) = (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ was obtained and plotted. The least squares principle minimizes the vertical distance between the data points and the straight line fitted to the data. The best fitting line to these data is the straight line $y_t = c + \beta x_t$ such that
$$ Q(c, \beta) = \sum_{t=1}^{n}\left(y_t - c - \beta x_t\right)^2 $$
is a minimum. To obtain the estimators of $c$ and $\beta$ we differentiate $Q$ w.r.t. $c$ and $\beta$. Equating to zero yields the following system of equations:
$$ \frac{\partial Q}{\partial c} = -2\sum_{t=1}^{n}\left(y_t - c - \beta x_t\right) = 0 \tag{2.22} $$
and
$$ \frac{\partial Q}{\partial \beta} = -2\sum_{t=1}^{n}\left(y_t - c - \beta x_t\right) x_t = 0. \tag{2.23} $$
Expanding and solving equations (2.22) and (2.23) simultaneously, we have
$$ \hat{\beta} = \frac{n\sum x_t y_t - \sum x_t \sum y_t}{n\sum x_t^2 - \left(\sum x_t\right)^2} \tag{2.24} $$
and
$$ \hat{c} = \bar{y} - \hat{\beta}\,\bar{x}; \qquad \hat{\alpha} = \exp\left(-\frac{\hat{c}}{\hat{\beta}}\right) \tag{2.25} $$
where $\hat{\alpha}$ and $\hat{\beta}$ are the estimators of $\alpha$ and $\beta$ respectively.
3 Method assessment and selection
3.1 Comparison of estimation methods
The Mean Squared Error (MSE) criterion is given by
$$ MSE = \frac{1}{n}\sum_{i=1}^{n}\left[\hat{F}(x_i) - F(x_i)\right]^2 \tag{3.1} $$
where $\hat{F}(x_i)$ is obtained by substituting the estimates of $\alpha$ and $\beta$ (for each method) in (2.4), while $F(x_i) = i/n$ is the empirical distribution function. The method with the minimum mean squared error ($MSE_{min}$) becomes the best method for the estimation of Weibull parameters among the candidate methods.
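The criterion in (3.1) compares the fitted Weibull cdf with the empirical distribution function at the ordered observations. A minimal Python sketch follows; the function name `fit_mse` is an assumption, and the code is an illustration rather than the authors' R implementation.

```python
import math


def fit_mse(data, alpha, beta):
    """Mean squared difference between the fitted cdf (2.4) and i/n, eq. (3.1)."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        fitted = 1.0 - math.exp(-((x / alpha) ** beta))  # fitted cdf at x_(i)
        empirical = i / n                                 # empirical cdf at x_(i)
        total += (fitted - empirical) ** 2
    return total / n
```

Calling `fit_mse` once per candidate pair of estimates and taking the smallest value reproduces the selection rule described above.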
3.2 Goodness-of-fit tests
Goodness-of-fit test procedures are intended to detect the existence of a significant difference between the observed (empirical) frequency of occurrence of an item and the theoretical (hypothesized) pattern of occurrence of that item. Here, the hypothesis is that the Weibull distribution is a good fit to the given dataset; this hypothesis is rejected if the computed test statistic is greater than or equal to a defined critical value.
Kolmogorov-Smirnov test
The Kolmogorov-Smirnov test is used to decide if a sample comes from a population with a specific distribution. It is based upon a comparison between the empirical distribution function (ECDF) and the theoretical one, defined as $F(x) = \int_{0}^{x} f(y, \theta)\,dy$, where $f(x, \theta)$ is the pdf of the Weibull distribution. Given $n$ ordered data points $X_1, X_2, \ldots, X_n$, the ECDF is defined as $F_n(X_i) = N(i)/n$, where $N(i)$ is the number of points less than $X_i$ (the $X_i$ are ordered from smallest to largest value). The test statistic used is
$$ D_n = \sup_{1 \le i \le n}\left|F_n(x_i) - F(x_i)\right|. \tag{3.2} $$
The statistic $D_n$ converges to zero almost surely as $n \to \infty$.
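The statistic in (3.2) can be computed from the sorted sample. The Python sketch below uses the common two-sided form, which checks the gap between the fitted cdf and the ECDF on both sides of each jump; this is an assumption about the exact convention, since the text evaluates the ECDF only at the data points. The function name `ks_statistic` is hypothetical.

```python
import math


def ks_statistic(data, alpha, beta):
    """Two-sided Kolmogorov-Smirnov distance to a fitted Weibull cdf (2.4)."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fitted = 1.0 - math.exp(-((x / alpha) ** beta))
        # The ECDF jumps from (i - 1)/n to i/n at x_(i); check both sides.
        d = max(d, i / n - fitted, fitted - (i - 1) / n)
    return d
```

The returned value would still need to be compared with a critical value for the chosen significance level, which is outside the scope of this sketch.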
4 Implementation
4.1 Data
The data used for this study are the weekly stock prices (N = 100 weeks) collected from Cornerstone Insurance Company PLC, a public limited company listed on the Nigerian Stock Exchange (see the Appendix, Table A1). The squared returns, $r_t^2$, defined in Section 2, are a measure of volatility in the stock prices and are multiplied by 100 without loss of generality. Figure 1 shows the weekly stock prices together with their squared returns. We estimate the parameters with the R software, for both the graphical and analytical procedures, using $100 r_t^2$ as the dataset, where the returns series $r_t$ is now of length $n$. R is a language and environment for statistical computing and graphics (R Foundation for Statistical Computing, 2013); it was run on the platform i386-w64-mingw32/i386 (32-bit).
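The transformation from prices to the working dataset follows the definition of returns in Section 2: the log return is the natural logarithm of the ratio of consecutive prices, and the dataset is 100 times its square. A minimal Python sketch, with the hypothetical function name `squared_returns_x100` and the first row of Table A1 as sample input:

```python
import math


def squared_returns_x100(prices):
    """Return 100 * r_t^2 with r_t = ln(s_t / s_{t-1}) for consecutive prices."""
    returns = [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]
    return [100.0 * r * r for r in returns]


# First row of Table A1 (weekly stock prices).
first_row = [1.03, 1.06, 0.99, 1.03, 0.99, 0.95, 0.96, 0.98, 0.93, 1.05]
x = squared_returns_x100(first_row)
```

A series of N prices yields N - 1 returns, so the ten prices above give nine squared returns.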
[Figure 1 plots the series Stock and SquareReturns*100 against Time.]
Figure 1: Plot showing relationship between Weekly Stock Prices and its Squared Returns*100
4.2 Simulation study
We carry out a numerical simulation study in order to investigate the behavior of the estimates of the shape and scale parameters of the Weibull distribution. In the simulation experiment we set the Weibull distribution on the random variable $X$ with shape parameter $\beta = 0.54$, with the aim of mimicking the squared returns ($100 r_t^2$). For the Weibull distribution on $X$, generate an independent and identically distributed random sample $(x_1, x_2, \ldots, x_n)$ of size $n$ (= 25, 50, 75, 100, 125, 150, 175, 200). Compute the mean of this sample and replicate this process $N$ times to obtain a series. For the series obtained at each sample size $n$, estimate $\beta$ and $\alpha$ using the methods described in Section 2, and compute the MSE and the Kolmogorov-Smirnov (KS) statistic. This sequence is of the form
$$ X^{*}_{1,\ldots,N} = \mathrm{mean}(x_1^{*}, \ldots, x_n^{*})_1,\ \mathrm{mean}(x_1^{*}, \ldots, x_n^{*})_2,\ \ldots,\ \mathrm{mean}(x_1^{*}, \ldots, x_n^{*})_N, \qquad N = 10000, $$
and is accomplished in R for Windows by the replicate function: replicate(N, mean(rweibull(n, shape = 0.54))). Because no scale argument is supplied, rweibull uses its default scale of 1.
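For readers without R, the replicate call can be mirrored in Python. The sketch below is an analogue under stated assumptions, not the authors' code: the function names `simulate_means` and `theoretical_mean` are hypothetical, the scale is fixed at 1 to match the rweibull default, and a seed parameter is added for reproducibility.

```python
import math
import random


def simulate_means(n, N, shape=0.54, scale=1.0, seed=None):
    """Return N means, each from an i.i.d. Weibull sample of size n."""
    rng = random.Random(seed)
    # random.weibullvariate takes (scale, shape) in that order.
    return [
        sum(rng.weibullvariate(scale, shape) for _ in range(n)) / n
        for _ in range(N)
    ]


def theoretical_mean(shape=0.54, scale=1.0):
    """E(X) = scale * Gamma(1 + 1/shape), for comparison with the simulation."""
    return scale * math.gamma(1.0 + 1.0 / shape)
```

For example, `simulate_means(25, 10000, seed=1)` gives one series for the smallest sample size in the study; the average of the series should lie close to `theoretical_mean()`.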
We remark here that the least squares method (LSM) is related to the graphical procedure in the estimation of Weibull parameters through (2.6), where $Y = \ln\ln\left(1/(1 - F(x_i))\right)$ is dependent upon the particular graphical method (e.g., $F(x_i) = i/(n + 1)$ for the mean rank) and $X = \ln x_i$; see also equations (2.7) and (2.25).
4.3 Results and Discussion
All computations and simulations in this investigation were done in R version 3.0.0. We relied on the functions fitdist() and fitdistr(), from the R packages fitdistrplus and MASS respectively (see, e.g., Delignette-Muller et al. (2013) and Ripley (2013)), for maximum likelihood estimation of the parameters and for plots, while code was written for the other methods. Results for the graphical procedure (MR, MDR and SCDF) were verified using the approach in Dorner (1999) on Microsoft Excel 2013. The R code used for this study is available from the first author on request.
Estimates of the parameters based upon both the graphical and analytical procedures described in Section 2.2 are presented in Table 2. The estimated shape parameter $\hat{\beta}$ lies within the interval (0, 1) for every method, which implies, as indicated in Section 2.1, that the fitted density is monotonically decreasing. We ranked the performance of the methods based on the least MSE criterion. In comparison, the Mean Rank (MR) method has the least MSE ($3.88 \times 10^{-3}$) and at the same time the least $D_n$ (0.0563), making it the best among the five methods under study (graphical and analytical procedures) for this particular dataset. The Maximum Likelihood Estimation (MLE) method is, however, superior to the Method of Moments in the analytical procedure. From these results the best estimates of the shape and scale parameters are, respectively, $(\hat{\beta}, \hat{\alpha}) = (0.5325, 0.4539)$ based on our dataset.
The visual assessments of fit are shown in the histogram (Figure 2(a)), overlaid with the Weibull densities generated from the different methods, and in the empirical cumulative distribution function plot of Figure 2(b). The MOM is clearly different from the other methods given their MSEs, but this difference is not very clear in Figure 2. However, the simulation results (Table 3) show that the MLE performed best 86% of the time when the simulations for each $n$ are run 10,000 times. A similar result was obtained when the KS goodness-of-fit test was conducted to test the adequacy of the Weibull distribution in fitting the simulated data.
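Table 2: Summary of results and comparison of methods for Weibull parameter estimation

| Procedure | Method | $\hat{\alpha}$ | $\hat{\beta}$ | MSE | KS |
|---|---|---|---|---|---|
| Graphical | MR | 0.4539 | 0.5325 | $3.88 \times 10^{-3}$ | 0.0563 |
| Graphical | MDR | 0.4494 | 0.5452 | $4.21 \times 10^{-3}$ | 0.0615 |
| Graphical | SCDF | 0.4461 | 0.5553 | $4.49 \times 10^{-3}$ | 0.0656 |
| Analytical | MLE | 0.4563 | 0.5421 | $6.59 \times 10^{-3}$ | 0.0617 |
| Analytical | MOM | 0.5244 | 0.6026 | $1.18 \times 10^{-1}$ | 0.1055 |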
Table 3: Simulation results (based on 10,000 iterations)

| n | Measure | MR | MDR | SCDF | MLE | MOM |
|---|---|---|---|---|---|---|
| 25 | MSE | 3.5726 | 3.5815 | 3.5837 | 1.2557 | 1.6770 |
| 25 | KS | 0.0600 | 0.0600 | 0.0601 | 0.0501 | 0.9821 |
| 50 | MSE | 4.6281 | 4.6323 | 4.6282 | 1.4930 | 3.5122 |
| 50 | KS | 0.0681 | 0.0682 | 0.0683 | 0.0540 | 0.9596 |
| 75 | MSE | 4.9234 | 4.9502 | 4.9407 | 1.5438 | 4.2108 |
| 75 | KS | 0.0683 | 0.0684 | 0.0684 | 0.0563 | 0.9741 |
| 100 | MSE | 4.8839 | 4.9119 | 4.8985 | 1.3216 | 4.4869 |
| 100 | KS | 0.0653 | 0.0654 | 0.0654 | 0.0587 | 0.0964 |
| 125 | MSE | 5.2496 | 5.2389 | 5.2598 | 1.4261 | 4.9398 |
| 125 | KS | 0.0750 | 0.0750 | 0.0751 | 0.0590 | 0.9600 |
| 150 | MSE | 5.4266 | 5.4118 | 5.4341 | 1.4671 | 5.2043 |
| 150 | KS | 0.0672 | 0.0671 | 0.0673 | 0.0604 | 0.9665 |
| 175 | MSE | 6.4067 | 6.3872 | 6.4096 | 1.7235 | 6.0586 |
| 175 | KS | 0.0726 | 0.0726 | 0.0726 | 0.0657 | 0.9720 |
| 200 | MSE | 5.1548 | 5.1831 | 1.3525 | 1.4170 | 5.0833 |
| 200 | KS | 0.0674 | 0.0675 | 0.0818 | 0.0614 | 0.9816 |
[Figure 2 panels: (a) histogram of Square returns*100 overlaid with the fitted Weibull densities for MLE, MOM, MR, MDR and SCDF; (b) ecdf(x) plotted against x.]
Figure 2: Fit of different methods (a) Density and Histogram (b) ECDF
5 Conclusion
The performances of five methods in the estimation of the parameters of the Weibull distribution were compared in this study. The MR was selected as the method that gives the best estimates of the two-parameter model for the squared returns dataset, while the MLE is preferred over the MOM for the analytical procedure. These decisions were based on the minimum MSE criterion. When these methods were compared based upon simulation results, the maximum likelihood estimation method showed superiority over the other methods. The least squares method (LSM), we remark, is also known as the rank regression method (RRM) because the estimation of the parameters of the Weibull distribution depends on regressing some form of log and rank transformations of a given dataset according to the rank plotting position.
References
[1] Al-Fawzan, M. (2000): Methods for Estimating the Parameters of Weibull Distribution. King Abdulaziz City for Science and Technology, Saudi Arabia.
[2] Cheng, S. K. and Chen, C. H. (1988): Estimation of the Weibull parameters with grouped data. Communications in Statistics: Simulation and Computation, 11, 197-216.
[3] Cohen, A. C. (1965): Maximum Likelihood Estimation in the Weibull Distribution Based on Complete and on Censored Samples. Technometrics, 7 (3).
[4] Cornerstone Insurance Company Plc. www.conerstoneinsuranceplc.com Accessed 16th September, 2012.
[5] Delignette-Muller, M. L., Pouillot, R., Denis, J. and Dutang, C. (2013): R Package fitdistrplus. http://www.cran.r-project.org/package=fitdistrplus
[6] Dorner, W. W. (1999): Using Microsoft Excel for Weibull Analysis. www.qualitydigest.com/jan99/html/weibull.html
[7] Harter, H. L. and Moore, A. H. (1965): Maximum Likelihood Estimation of the Parameters of Gamma and Weibull Populations from Complete and Censored Samples. Technometrics, 7 (4).
[8] Johnson, N. L., Kotz, S. and Balakrishnan, N. (1994): Maximum Likelihood Estimation for Weibull Distribution. John Wiley & Sons, New York.
[9] Lawless, J. F. (1982): Statistical Models for Lifetime Data. 2nd Edition, John Wiley & Sons, New York.
[10] Lawless, J. F. (2003): Statistical Models and Methods for Lifetime Data. 3rd Edition, John Wiley & Sons, New York.
[11] Mann, N. R., Schafer, R. E. and Singpurwalla, N. D. (1974): Methods of Statistical Analysis of Reliability and Life Data. John Wiley & Sons, New York.
[12] Mood, A. M., Graybill, F. A. and Boes, D. C. (1974): Introduction to the Theory of Statistics. 3rd Edition, McGraw Hill, Kogakusha.
[13] R Development Core Team (2013): http://www.r-project.org
[14] Ripley, B. (2013): R package MASS. http://www.cran.r-project.org/package=MASS
[15] Tang, Y. (2004): Extended Weibull Distributions in Reliability Engineering. A Thesis Submitted to Department of Industrial & System Engineering, National University of Singapore.
[16] Weibull, W. (1951): A Statistical Distribution of Wide Applicability. Journal of Applied Mechanics, 18, 239-296.
Appendix
Table A1: Weekly stock prices (read row-wise)
1.03	1.06	0.99	1.03	0.99	0.95	0.96	0.98	0.93	1.05
0.92	0.99	0.97	0.96	0.91	0.94	0.97	0.99	1.15	1.27
1.46	1.83	2.31	2.49	2.73	2.70	2.52	2.49	2.76	3.00
3.18	3.88	3.84	3.79	3.76	3.75	3.89	4.04	4.70	4.34
4.55	4.20	4.19	4.12	4.13	3.77	3.25	3.14	3.12	2.82
3.24	3.44	3.50	3.64	3.72	3.68	3.41	3.24	3.26	3.42
3.38	4.02	4.21	4.23	4.04	4.11	4.28	4.84	4.46	4.87
5.00	5.91	7.36	7.34	7.23	7.19	6.79	6.03	5.97	5.69
6.42	6.23	5.86	5.46	4.71	4.32	4.79	4.62	4.54	4.22
4.28	4.08	3.95	4.16	3.50	3.65	3.22	3.50	3.97	2.96