Metodološki zvezki, Vol. 2, No. 2, 2005, 243-257

Properties and Estimation of GARCH(1,1) Model

Petra Posedel^1

Abstract

We study in depth the properties of the GARCH(1,1) model and the assumptions on the parameter space under which the process is stationary. In particular, we prove ergodicity and strong stationarity for the conditional variance (squared volatility) of the process. We show under which conditions higher order moments of the GARCH(1,1) process exist and conclude that GARCH processes are heavy-tailed. We investigate the sampling behavior of the quasi-maximum likelihood estimator of the Gaussian GARCH(1,1) model. A bounded conditional fourth moment of the rescaled variable (the ratio of the disturbance to the conditional standard deviation) is sufficient for the result. Consistent estimation and asymptotic normality are demonstrated, as well as consistent estimation of the asymptotic covariance matrix.

1 Introduction

Financial markets react nervously to political disorders, economic crises, wars or natural disasters. In such stress periods prices of financial assets tend to fluctuate heavily. Statistically speaking, this means that the conditional variance given the past, $\mathrm{Var}(X_t \mid X_{t-1}, X_{t-2}, \dots)$, is not constant over time, and the process $(X_t)$ is conditionally heteroskedastic. Econometricians usually say that the volatility $\sigma_t = \sqrt{\mathrm{Var}(X_t \mid X_{t-1}, X_{t-2}, \dots)}$ changes over time. Understanding the nature of such time dependence is very important for many macroeconomic and financial applications, e.g. irreversible investments, option pricing, asset pricing etc. Models of conditional heteroskedasticity for time series play a very important role in today's financial risk management and its attempts to make financial decisions on the basis of observed asset price data $P_t$ in discrete time. Prices $P_t$ are believed to be nonstationary, so they are usually transformed into the so-called log returns

$$X_t = \log P_t - \log P_{t-1}.$$
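The log-return transformation above can be sketched in a few lines; the price values below are made up for illustration and are not from the paper:

```python
import numpy as np

# Transform a (made-up) price series P_t into log returns
# X_t = log P_t - log P_{t-1}; the return series is one observation shorter.
prices = np.array([100.0, 101.5, 99.8, 100.2, 102.0])
log_returns = np.diff(np.log(prices))
```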
Log returns are supposed to be stationary, at least over periods of time that are not too long. In the past it was often suggested that $(X_t)$ represents a sequence of independent identically distributed random variables, in other words, that log returns evolve like a random walk. Samuelson suggested modelling speculative prices in continuous time with the geometric Brownian motion. Discretization of that model leads to a random walk with independent identically distributed Gaussian increments of log return prices in discrete time. This hypothesis was rejected in the early sixties. Empirical studies based on the log return time series data of some US stocks showed the following observations, the so-called stylized facts of financial data:

- serial dependence is present in the data,
- volatility changes over time,
- the distribution of the data is heavy-tailed, asymmetric and therefore not Gaussian.

These observations clearly show that a random walk with Gaussian increments is not a very realistic model for financial data. It took some time before R. Engle found a discrete model that described the previously mentioned stylized facts of financial data very well, but was also relatively simple and stationary, so that inference was possible. Engle called his model autoregressive conditionally heteroskedastic (ARCH), because the conditional variance (squared volatility) is not constant over time and shows an autoregressive structure. Some years later, T. Bollerslev generalized the model by introducing the generalized autoregressive conditionally heteroskedastic (GARCH) model. The properties of GARCH models are not easy to determine.

^1 Faculty of Economics, University of Zagreb, Zagreb, Croatia

2 GARCH(1,1) process

Definition 2.1 Let $(Z_t)$ be a sequence of i.i.d. random variables such that $Z_t \sim N(0,1)$.
$(X_t)$ is called the generalized autoregressive conditionally heteroskedastic or GARCH(q,p) process if

$$X_t = \sigma_t Z_t, \qquad t \in \mathbb{Z}, \tag{2.1}$$

where $(\sigma_t)$ is a nonnegative process such that

$$\sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2 + \dots + \alpha_q X_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_p \sigma_{t-p}^2, \qquad t \in \mathbb{Z}, \tag{2.2}$$

and

$$\alpha_0 > 0, \qquad \alpha_i \ge 0, \; i = 1, \dots, q, \qquad \beta_i \ge 0, \; i = 1, \dots, p. \tag{2.3}$$

The conditions on the parameters ensure the positivity of the conditional variance (2.2). If we write equation (2.2) in terms of the lag operator $B$ we get

$$\sigma_t^2 = \alpha_0 + \alpha(B) X_t^2 + \beta(B) \sigma_t^2, \tag{2.4}$$

where

$$\alpha(B) = \alpha_1 B + \alpha_2 B^2 + \dots + \alpha_q B^q \quad \text{and} \quad \beta(B) = \beta_1 B + \beta_2 B^2 + \dots + \beta_p B^p. \tag{2.5}$$

If the roots of the characteristic equation $1 - \beta_1 x - \beta_2 x^2 - \dots - \beta_p x^p = 0$ lie outside the unit circle and the process $(X_t)$ is stationary, then we can write (2.2) as

$$\sigma_t^2 = \frac{\alpha_0}{1 - \beta(1)} + \frac{\alpha(B)}{1 - \beta(B)} X_t^2 = \phi_0 + \sum_{i=1}^{\infty} \phi_i X_{t-i}^2, \tag{2.6}$$

where $\phi_0 = \alpha_0 / (1 - \beta(1))$ and $\phi_i$ are the coefficients of $B^i$ in the expansion of $\alpha(B)[1 - \beta(B)]^{-1}$. Note that expression (2.6) tells us that the GARCH(q,p) process is an ARCH process of infinite order with a rational structure of the coefficients.

From (2.1) it is obvious that the GARCH(1,1) process is stationary if the process $(\sigma_t^2)$ is stationary. So if we want to study the properties and higher order moments of the GARCH(1,1) process, it is enough to do so for the process $(\sigma_t^2)$. The following theorem gives the main result on stochastic difference equations that we are going to use in order to establish the stationarity of the process $(\sigma_t^2)$.

Theorem 2.2 Let $(Y_t)$ be the stochastic process defined by

$$Y_t = A_t + B_t Y_{t-1}, \qquad t \in \mathbb{N}, \tag{2.7}$$

or explicitly

$$Y_t = Y_0 \prod_{j=1}^{t} B_j + \sum_{m=1}^{t} A_m \prod_{j=m+1}^{t} B_j, \qquad t \in \mathbb{N}. \tag{2.8}$$

Suppose that $Y_0$ is independent of the i.i.d. sequence $(A_t, B_t)_t$, and assume that $E \ln^+ |A| < \infty$ and $E \ln |B| < 0$. Then $Y_t$ converges in distribution to

$$Y = \sum_{m=1}^{\infty} A_m \prod_{j=1}^{m-1} B_j,$$

the series converges absolutely with probability 1, and the process started from $Y_0 = Y$ is strictly stationary. Now assume the moment conditions $E|A|^p < \infty$ and $E|B|^p < 1$ for some $p \in [1, \infty)$. Then $E|Y|^p < \infty$ and $EY_t^m \to EY^m$ as $t \to \infty$ for $1 \le m \le \lfloor p \rfloor$. The moments $EY^m$ are uniquely determined by the equations

$$EY^m = \sum_{k=0}^{m} \binom{m}{k} E\bigl(B^k A^{m-k}\bigr)\, EY^k, \qquad m = 1, \dots, \lfloor p \rfloor, \tag{2.12}$$

where $\lfloor p \rfloor$ denotes the floor of $p$.
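As a numerical illustration of Definition 2.1 and of the stationarity condition used in Theorems 2.2 and 2.3, the sketch below (ours, not from the paper; the parameter values are made up) simulates a Gaussian GARCH(1,1) path and estimates $E[\ln(\beta_1 + \alpha_1 Z^2)]$ by Monte Carlo:

```python
import numpy as np

# Sketch (made-up parameters): simulate a Gaussian GARCH(1,1) process and
# check the stationarity condition E[ln(beta1 + alpha1 * Z^2)] < 0.
rng = np.random.default_rng(0)

alpha0, alpha1, beta1 = 0.1, 0.1, 0.8   # alpha1 + beta1 < 1
n = 100_000

z = rng.standard_normal(n)
sigma2 = np.empty(n)
x = np.empty(n)
sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)   # start at the stationary mean
for t in range(n):
    if t > 0:
        # conditional-variance recursion (2.2) with q = p = 1
        sigma2[t] = alpha0 + alpha1 * x[t - 1]**2 + beta1 * sigma2[t - 1]
    x[t] = np.sqrt(sigma2[t]) * z[t]

# Monte Carlo estimate of E[ln(beta1 + alpha1 * Z^2)] (condition (2.13))
lyap = np.mean(np.log(beta1 + alpha1 * rng.standard_normal(n)**2))

ex2_theory = alpha0 / (1.0 - alpha1 - beta1)  # unconditional variance
ex2_sample = np.mean(x**2)
```

For these parameter values the estimated expectation is negative, and the sample variance of the simulated path settles near the theoretical unconditional variance $\alpha_0/(1 - \alpha_1 - \beta_1)$.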
In the next theorem we present the stationarity of the conditional variance process $(\sigma_t^2)$.

Theorem 2.3 Let $(\sigma_t^2)$ be the conditional variance of the GARCH(1,1) process defined by (2.1) and (2.2). Additionally, assume that

$$E\bigl[\ln(\alpha_1 Z^2 + \beta_1)\bigr] < 0 \tag{2.13}$$

and that $\sigma_0$ is independent of $(Z_t)$. Then the following holds:

(a) The process $(\sigma_t^2)$ is strictly stationary if

$$\sigma_0^2 = \alpha_0 \sum_{m=1}^{\infty} \prod_{j=1}^{m-1} \bigl(\beta_1 + \alpha_1 Z_{-j}^2\bigr) \tag{2.14}$$

and the series (2.14) converges absolutely with probability 1.

(b) Assume that $(\sigma_t^2)$ is strictly stationary and let $\sigma = \sigma_0$, $Z = Z_1$. Let $E(\beta_1 + \alpha_1 Z^2)^p < 1$ for some $p \in [1, \infty)$. Then $E(\sigma^2)^m < \infty$ for $1 \le m \le \lfloor p \rfloor$. For such an integer $m$ it holds that

$$E\bigl[\sigma^{2m}\bigr] = \bigl[1 - E(\beta_1 + \alpha_1 Z^2)^m\bigr]^{-1} \sum_{k=0}^{m-1} \binom{m}{k} E\bigl(\alpha_1 Z^2 + \beta_1\bigr)^k \, \alpha_0^{m-k} \, E\bigl[\sigma^{2k}\bigr]. \tag{2.15}$$

In particular, for $m = 1$ formula (2.15) gives the unconditional variance $EX_t^2 = E\sigma^2 = \alpha_0/(1 - \alpha_1 - \beta_1)$, and for $m = 2$, together with $EX_t^4 = 3E\sigma^4$ for Gaussian innovations, it yields

$$EX_t^4 - 3\bigl(EX_t^2\bigr)^2 = \frac{6\,\alpha_0^2\,\alpha_1^2}{(1 - \alpha_1 - \beta_1)^2 \bigl(1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2\bigr)}. \tag{2.16}$$

Since $\alpha_0 > 0$, $1 - \alpha_1 - \beta_1 > 0$ and $1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2 > 0$, it follows that all the factors in (2.16) are positive, so we conclude that the GARCH(1,1) process has the so-called leptokurtic distribution.

3 Estimation of the GARCH(1,1) model

Although in this section we assume that $(Z_t)$ is an i.i.d. sequence of random variables, the results we shall present can also be shown for $(Z_t)$ a strictly stationary and ergodic sequence of random variables. In that case, the assumptions on the process $(Z_t)$ are slightly modified, but the main part of the calculus we present here also holds under such weaker assumptions.

3.1 Description of the model and the quasi-likelihood function

Suppose we observe a sequence $(Y_t)$ such that

$$Y_t = C_0 + \varepsilon_{0t}, \qquad t = 1, \dots, n,$$

where we assume that $(\varepsilon_{0t})$ is a GARCH(1,1) process, namely

$$\varepsilon_{0t} = Z_t \sigma_{0t}, \qquad \mathcal{F}_t = \sigma\bigl(\{\varepsilon_{0s}, s \le t\}\bigr),$$

where $(Z_t)$ is a sequence of i.i.d. random variables and

$$\sigma_{0t}^2 = \omega_0 (1 - \beta_0) + \alpha_0 \varepsilon_{0,t-1}^2 + \beta_0 \sigma_{0,t-1}^2 \quad \text{a.s.} \tag{3.1}$$

From Theorem 2.2 we have that the strictly stationary solution of (3.1) is given by

$$\sigma_{0t}^2 = \omega_0 + \alpha_0 \sum_{k=0}^{\infty} \beta_0^k \varepsilon_{0,t-1-k}^2 \quad \text{a.s.},$$

if it holds that $E[\ln(\beta_0 + \alpha_0 Z^2)] < 0$. The process is described by the vector of parameters $\theta_0 = (C_0, \omega_0, \alpha_0, \beta_0)'$.
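The moment recursion (2.15) can be evaluated directly for $m = 1, 2$. The sketch below (ours; the parameter values are made up) computes the implied kurtosis for Gaussian innovations and confirms that it exceeds the Gaussian value 3, in agreement with the leptokurtosis conclusion:

```python
# Sketch (ours; parameter values made up): evaluate the moment recursion (2.15)
# for a Gaussian GARCH(1,1), using E(beta1 + alpha1 Z^2) = beta1 + alpha1 and
# E(beta1 + alpha1 Z^2)^2 = beta1^2 + 2 alpha1 beta1 + 3 alpha1^2 (since EZ^4 = 3).
alpha0, alpha1, beta1 = 0.1, 0.1, 0.8

c1 = beta1 + alpha1
c2 = beta1**2 + 2 * alpha1 * beta1 + 3 * alpha1**2

es2 = alpha0 / (1.0 - c1)                                # (2.15) with m = 1
es4 = (alpha0**2 + 2 * c1 * alpha0 * es2) / (1.0 - c2)   # (2.15) with m = 2

# X_t = sigma_t Z_t with Z_t ~ N(0,1): EX^2 = E sigma^2, EX^4 = 3 E sigma^4
kurtosis = 3.0 * es4 / es2**2

# excess kurtosis implied by (2.16), divided by (EX^2)^2
excess_closed_form = 6.0 * alpha1**2 / (1.0 - c2)
```

The excess kurtosis obtained from the recursion agrees with the closed-form expression, which is a useful consistency check on both formulas.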
The model for the unknown parameters $\theta = (C, \omega, \alpha, \beta)'$ is given by

$$Y_t = C + \varepsilon_t, \qquad t = 1, \dots, n,$$

and

$$\sigma_t^2(\theta) = \omega(1 - \beta) + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2(\theta), \qquad t = 2, \dots, n,$$

with the initial condition $\sigma_1^2(\theta) = \omega$. With this notation we have the following expression for the conditional variance process:

$$\sigma_t^2 = \omega + \alpha \sum_{k=0}^{t-2} \beta^k \varepsilon_{t-1-k}^2.$$

Let us define the compact parameter space

$$\Theta = \bigl\{\theta : C_l \le C \le C_d,\; 0 < \omega_l \le \omega \le \omega_d,\; 0 < \alpha_l \le \alpha \le \alpha_d,\; 0 < \beta_l \le \beta \le \beta_d < 1\bigr\} \subset \bigl\{\theta : E[\ln(\beta + \alpha Z^2)] < 0\bigr\}.$$

Additionally, assume that $\theta_0 \in \Theta$, so it immediately follows that $\alpha_0 > 0$ and $\beta_0 > 0$.

Inference for the GARCH(1,1) process usually assumes that $(Z_t)$ are i.i.d. random variables such that $Z_t \sim N(0,1)$, so the likelihood function is easy to determine. Assuming that the likelihood function is Gaussian, the log-likelihood function is of the form (ignoring constants)

$$L_T(\theta) = \frac{1}{2T} \sum_{t=1}^{T} l_t(\theta), \qquad \text{where } l_t(\theta) = -\left( \ln \sigma_t^2(\theta) + \frac{\varepsilon_t^2}{\sigma_t^2(\theta)} \right).$$

Since the likelihood function need not be Gaussian, in other words, the process $(Z_t)$ need not be Gaussian white noise, $L_T$ is called the quasi-likelihood function.

3.2 Consistency of the quasi-maximum likelihood estimator

Although only a finite data set is available in practice, this is not enough to establish good properties of an estimator. We shall see in this section how useful results can be obtained by considering the strictly stationary model for the conditional variance that we defined previously. We denote it by

$$\sigma_{ut}^2(\theta) = \omega + \alpha \sum_{k=0}^{\infty} \beta^k \varepsilon_{t-1-k}^2, \qquad \varepsilon_t = Y_t - C,$$

to avoid confusion with the original conditional variance process $(\sigma_t^2)$. In that case the quasi-likelihood function is given by

$$L_{uT}(\theta) = \frac{1}{2T} \sum_{t=1}^{T} l_{ut}(\theta), \qquad \text{where } l_{ut}(\theta) = -\left( \ln \sigma_{ut}^2(\theta) + \frac{\varepsilon_t^2}{\sigma_{ut}^2(\theta)} \right).$$

Additionally, we are going to show that the stationary and the non-stationary model are not "far away" in some sense. So all the calculus is done using the stationary model, and the two models are then connected.
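The quasi-log-likelihood of Section 3.1 can be computed directly from the variance recursion. The sketch below is our own illustration (the function name and the toy data are made up), using the initial condition $\sigma_1^2(\theta) = \omega$:

```python
import numpy as np

# Sketch (ours): quasi-log-likelihood L_T(theta) of Section 3.1, built from the
# recursion sigma_t^2 = omega*(1-beta) + alpha*eps_{t-1}^2 + beta*sigma_{t-1}^2
# with initial condition sigma_1^2 = omega.
def quasi_log_likelihood(y, C, omega, alpha, beta):
    eps = y - C
    T = len(y)
    sigma2 = np.empty(T)
    sigma2[0] = omega                      # initial condition sigma_1^2 = omega
    for t in range(1, T):
        sigma2[t] = omega * (1.0 - beta) + alpha * eps[t - 1]**2 + beta * sigma2[t - 1]
    # l_t(theta) = -(ln sigma_t^2 + eps_t^2 / sigma_t^2), L_T = (1/2T) sum l_t
    lt = -(np.log(sigma2) + eps**2 / sigma2)
    return lt.sum() / (2.0 * T)

# toy data (made up): i.i.d. N(0,1) observations
rng = np.random.default_rng(1)
y = rng.standard_normal(500)
L = quasi_log_likelihood(y, C=0.0, omega=1.0, alpha=0.05, beta=0.9)
```

As a sanity check, parameter values close to the data-generating process produce a larger quasi-likelihood than badly misspecified ones (e.g. an inflated $\omega$).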
Let us define

$$\tilde{\sigma}_t^2(\theta) = \omega + \alpha \sum_{k=0}^{\infty} \beta^k \varepsilon_{0,t-1-k}^2.$$

The process $(\sigma_{ut}^2)$ is a strictly stationary model of the conditional variance which assumes an infinite history of the observed data. The process $(\tilde{\sigma}_t^2)$ is in fact identical to the process $(\sigma_{ut}^2)$ except that it is expressed as a function of the true innovations $(\varepsilon_{0t})$ instead of the residuals $(\varepsilon_t)$. We suppose that the following conditions on the process $(Z_t)$ hold:

(1) $(Z_t)$ is a sequence of i.i.d. random variables such that $EZ_t = 0$;

(2) $Z_t^2$ is nondegenerate;

(3) for some $\delta > 0$ there exists $S_\delta < \infty$ such that $E\bigl[Z_t^{2+\delta}\bigr] \le S_\delta < \infty$;

(4) $E\bigl[\ln(\beta_0 + \alpha_0 Z_t^2)\bigr] < 0$;

(5) $\theta_0$ is in the interior of $\Theta$;

(6) if for some $t$ it holds that

$$\sigma_{0t}^2 = c_0 + \sum_{k=1}^{\infty} c_k \varepsilon_{t-k}^2 \quad \text{and} \quad \sigma_{0t}^2 = c_0^* + \sum_{k=1}^{\infty} c_k^* \varepsilon_{t-k}^2,$$

then $c_i = c_i^*$ for every $1 \le i < \infty$.

We call conditions (1)-(6) the elementary conditions. The proof of the following result for the case of the general GARCH(q,p) process can be found in [5].

Proposition 3.3 If the elementary conditions hold, there are no two different vectors $(\omega, \alpha, \beta, C)$ and $(\omega^*, \alpha^*, \beta^*, C^*)$ such that

$$\sigma_{0t}^2 = \omega^* + \alpha^* (Y_{t-1} - C^*)^2 + \beta^* \sigma_{0,t-1}^2 \quad \text{and} \quad \sigma_{0t}^2 = \omega + \alpha (Y_{t-1} - C)^2 + \beta \sigma_{0,t-1}^2.$$

The following lemma will be very helpful for the results we shall provide. The proof can be found in [10].

Lemma 3.4 Uniformly on $\Theta$,

$$B^{-1} \tilde{\sigma}_t^2(\theta) \le \sigma_{ut}^2(\theta) \le B \tilde{\sigma}_t^2(\theta) \quad \text{a.s.},$$

where

$$B = 1 + 2(1 - \beta_d)^{-1/2} (C_d - C_l) \max\left( \sqrt{\frac{\alpha_d}{\omega_l}},\, 1 \right) + \frac{\alpha_d (C_d - C_l)^2}{(1 - \beta_d)\,\omega_l}.$$

Although we are not going to discuss the rational moments of the process $\sigma_{0t}$ in detail, we mention that, under the elementary conditions, there exists $0 < p < 1$ such that

$$E\bigl(\sigma_{0t}^2\bigr)^p < \infty. \tag{3.2}$$

The proof of such a result can be found in [13], Theorem 4. The following lemma gives the basic properties of the process $(\sigma_{ut}^2)$ and the likelihood function $(l_{ut})$.

Lemma 3.5 If the elementary conditions hold:

(i) the process $(\sigma_{ut}^2(\theta))$ is strictly stationary and ergodic;

(ii) the process $(l_{ut}(\theta))$ and the processes of its first and second derivatives with respect to $\theta$
are strictly stationary and ergodic for every $\theta \in \Theta$;

(iii) for some $0 < p < 1$ and for every $\theta \in \Theta$ it holds that $E\bigl|\sigma_{ut}^2(\theta)\bigr|^p \le H_p < \infty$.

Proof: Statement (i) follows from Theorem 2.3. Since

$$l_{ut}(\theta) = -\left( \ln \sigma_{ut}^2(\theta) + \frac{\varepsilon_t^2}{\sigma_{ut}^2(\theta)} \right),$$

the first derivatives of $l_{ut}$ are

$$\frac{\partial l_{ut}}{\partial \omega} = \left( \frac{\varepsilon_t^2}{\sigma_{ut}^2} - 1 \right) \frac{1}{\sigma_{ut}^2(\theta)} \frac{\partial \sigma_{ut}^2(\theta)}{\partial \omega}, \tag{3.3}$$

$$\frac{\partial l_{ut}}{\partial \beta} = \left( \frac{\varepsilon_t^2}{\sigma_{ut}^2} - 1 \right) \frac{1}{\sigma_{ut}^2(\theta)} \frac{\partial \sigma_{ut}^2(\theta)}{\partial \beta}, \tag{3.4}$$

$$\frac{\partial l_{ut}}{\partial C} = \left( \frac{\varepsilon_t^2}{\sigma_{ut}^2} - 1 \right) \frac{1}{\sigma_{ut}^2(\theta)} \frac{\partial \sigma_{ut}^2(\theta)}{\partial C} + \frac{2\varepsilon_t}{\sigma_{ut}^2(\theta)}, \tag{3.5}$$

$$\frac{\partial l_{ut}}{\partial \alpha} = \left( \frac{\varepsilon_t^2}{\sigma_{ut}^2} - 1 \right) \frac{1}{\sigma_{ut}^2(\theta)} \frac{\partial \sigma_{ut}^2(\theta)}{\partial \alpha}, \tag{3.6}$$

where

$$\frac{\partial \sigma_{ut}^2}{\partial \omega} = 1, \tag{3.7}$$

$$\frac{\partial \sigma_{ut}^2}{\partial \alpha} = \varepsilon_{t-1}^2 + \beta \frac{\partial \sigma_{u,t-1}^2}{\partial \alpha}, \tag{3.8}$$

$$\frac{\partial \sigma_{ut}^2}{\partial C} = -2\alpha \varepsilon_{t-1} + \beta \frac{\partial \sigma_{u,t-1}^2}{\partial C}, \tag{3.9}$$

$$\frac{\partial \sigma_{ut}^2}{\partial \beta} = -\omega + \sigma_{u,t-1}^2 + \beta \frac{\partial \sigma_{u,t-1}^2}{\partial \beta}, \tag{3.10}$$

it follows that the process $(l_{ut}(\theta))$ and the processes of its first and second derivatives are measurable functions of the strictly stationary and ergodic process $(\varepsilon_t)$, and so they are also strictly stationary and ergodic. Finally, let $0 < p < 1$ be as in (3.2). Then it follows from Lemma 3.4 that

$$E\bigl(\sigma_{ut}^2(\theta)\bigr)^p \le B^p E\bigl(\tilde{\sigma}_t^2(\theta)\bigr)^p \le B^p \left[ \omega^p + \alpha^p \sum_{k=0}^{\infty} \beta^{kp} E\bigl(\varepsilon_{0,t-1-k}^2\bigr)^p \right].$$

Since $\varepsilon_{0,t-1-k}^2 \le \alpha_0^{-1} \sigma_{0,t-k}^2$ for every $k$, using (3.2) and stationarity it follows that

$$E\bigl(\sigma_{ut}^2(\theta)\bigr)^p \le B^p \left[ \omega_d^p + \frac{\alpha_d^p}{\alpha_0^p \bigl(1 - \beta_d^p\bigr)} E\bigl(\sigma_{0t}^2\bigr)^p \right] =: H_p < \infty,$$

which proves statement (iii).

Before stating the consistency results we introduce some notation. Let $\delta > 0$ and $S_\delta$ be as in the elementary conditions, set $\rho = 1 - 2/(2+\delta) \in (0,1)$, and let $\kappa_l < \infty$ and $\kappa_0$ be constants determined by the parameter space $\Theta$. Let $\nu_l$ and $\nu_d$ be positive constants, chosen small enough in terms of $\beta_0$ and $\gamma_{l0} = \gamma_l(\theta_0) < 1$. For $1 \le r \le 12$ define constants $\beta_{rl} < \beta_0 < \beta_{rd}$ (depending on $\nu_l$, $\nu_d$, $\kappa_l$ and $\kappa_0$) and the subspaces

$$\Theta_{rl} = \{\theta \in \Theta : \beta_{rl} \le \beta \le \beta_0\}, \qquad \Theta_{rd} = \{\theta \in \Theta : \beta_0 \le \beta \le \beta_{rd}\},$$

and $\Theta_r = \Theta_{rl} \cup \Theta_{rd}$.^2 The values $\nu_l$ and $\nu_d$ will depend on the constants $\kappa_l$ and $\kappa_0$, which are functions of the parameter space $\Theta$. Observe that we can choose $\bar{\Theta} = \Theta_{r_{\max}} \subset \Theta_r$ for all $1 \le r \le 12$.

^2 We will need $r$ to be 12 in Lemma 4.2. Our aim is to find the minimal $r$ so that all the statements presented below hold for every $\theta \in \Theta_r$.
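The derivative recursions (3.7)-(3.10) can be checked numerically against finite differences. The sketch below (ours, with made-up data and parameter values) adapts them to the finite-sample model of Section 3.1 with initial condition $\sigma_1^2 = \omega$, whose parameter derivatives at $t = 1$ are $(1, 0, 0, 0)$ for $(\omega, \alpha, \beta, C)$:

```python
import numpy as np

def sigma2_and_grad(y, C, omega, alpha, beta):
    """Conditional variance of the Section 3.1 model plus its gradient,
    computed with recursions of the form (3.7)-(3.10), adapted to the
    finite-sample initial condition sigma_1^2 = omega."""
    eps = y - C
    T = len(y)
    s2 = np.empty(T); dw = np.empty(T); da = np.empty(T)
    db = np.empty(T); dC = np.empty(T)
    s2[0], dw[0], da[0], db[0], dC[0] = omega, 1.0, 0.0, 0.0, 0.0
    for t in range(1, T):
        s2[t] = omega * (1 - beta) + alpha * eps[t - 1]**2 + beta * s2[t - 1]
        dw[t] = (1 - beta) + beta * dw[t - 1]            # d sigma^2 / d omega
        da[t] = eps[t - 1]**2 + beta * da[t - 1]         # (3.8)
        db[t] = -omega + s2[t - 1] + beta * db[t - 1]    # (3.10)
        dC[t] = -2 * alpha * eps[t - 1] + beta * dC[t - 1]  # (3.9)
    return s2, np.array([dC[-1], dw[-1], da[-1], db[-1]])

rng = np.random.default_rng(2)
y = rng.standard_normal(50)              # toy data, made up
theta = dict(C=0.1, omega=0.5, alpha=0.1, beta=0.8)

s2, grad = sigma2_and_grad(y, **theta)

# central finite differences of sigma_T^2 in each coordinate
h = 1e-6
fd = []
for name in ("C", "omega", "alpha", "beta"):
    tp = dict(theta); tp[name] += h
    tm = dict(theta); tm[name] -= h
    fd.append((sigma2_and_grad(y, **tp)[0][-1]
               - sigma2_and_grad(y, **tm)[0][-1]) / (2 * h))
fd = np.array(fd)
```

The analytic gradient and the finite-difference gradient agree to numerical precision, which confirms that the recursions are exact derivatives of the variance recursion.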
Lemma 3.7 Under the elementary conditions, for every $\theta \in \Theta_1$ it holds that:

(1) $E\bigl(\varepsilon_t^2 / \sigma_{ut}^2(\theta)\bigr) \le H_1 < \infty$;

(2) $E|l_{ut}(\theta)| < \infty$, so that $L(\theta) = E(-l_{ut}(\theta))$ is well defined.

Proof: It is straightforward to show that $E\bigl(\varepsilon_{0t}^2 / \tilde{\sigma}_t^2(\theta)\bigr) \le H_c$ for some constant $H_c < \infty$. With $g = C_0 - C$ we have $\varepsilon_t = \varepsilon_{0t} + g$, so using Lemma 3.4 and the fact that $\tilde{\sigma}_t^2$ is $\mathcal{F}_{t-1}$-measurable with $E(\varepsilon_{0t} \mid \mathcal{F}_{t-1}) = 0$,

$$E\left(\frac{\varepsilon_t^2}{\sigma_{ut}^2}\right) \le B\, E\left(\frac{(\varepsilon_{0t} + g)^2}{\tilde{\sigma}_t^2}\right) = B\left[ E\left(\frac{\varepsilon_{0t}^2}{\tilde{\sigma}_t^2}\right) + 2g\, E\left(\frac{E(\varepsilon_{0t} \mid \mathcal{F}_{t-1})}{\tilde{\sigma}_t^2}\right) + g^2\, E\left(\frac{1}{\tilde{\sigma}_t^2}\right) \right] \le B\left( H_c + \frac{(C_d - C_l)^2}{\omega_l} \right) = H_1,$$

which proves the first statement. Additionally, we have

$$E|l_{ut}(\theta)| = E\left| \ln \sigma_{ut}^2(\theta) + \frac{\varepsilon_t^2}{\sigma_{ut}^2(\theta)} \right| \le E\bigl|\ln \sigma_{ut}^2(\theta)\bigr| + E\left( \frac{\varepsilon_t^2}{\sigma_{ut}^2(\theta)} \right).$$

But for $x \ge 1$ and $0 < p < 1$ the inequality $\ln x \le p^{-1} x^p$ holds, so, since $\sigma_{ut}^2(\theta) \ge \omega \ge \omega_l$,

$$E\bigl|\ln \sigma_{ut}^2(\theta)\bigr| \le |\ln \omega_l| + E\left[ \ln \frac{\sigma_{ut}^2(\theta)}{\omega_l} \right] \le |\ln \omega_l| + \frac{1}{p\,\omega_l^p}\, E\bigl(\sigma_{ut}^2(\theta)\bigr)^p.$$

Finally, using Lemma 3.5 we have $E|l_{ut}(\theta)| < \infty$.

These results yield the consistency of the quasi-maximum likelihood estimator $\hat{\theta}_T = \arg\max_{\theta \in \Theta} L_T(\theta)$:

$$\hat{\theta}_T \stackrel{P}{\longrightarrow} \theta_0.$$

4 Asymptotic normality of the quasi-maximum likelihood estimator

In this section we present the asymptotic distribution of the quasi-maximum likelihood estimator (QMLE). In order to do so, we need stronger conditions on the process $(Z_t)$ than the elementary conditions given in the previous section. In fact, we require that the fourth moment of the random variable $Z_t$ is finite. We call the following condition the additional condition:

$$E\bigl(Z_t^4\bigr) < \infty.$$

Lemma 4.1 Suppose the elementary conditions and the additional condition hold. Then

$$\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \nabla l_{ut}(\theta_0) \stackrel{d}{\longrightarrow} N(0, A_0), \qquad A_0 = E\bigl[\nabla l_{ut}(\theta_0) \nabla l_{ut}(\theta_0)'\bigr].$$

Lemma 4.2 Suppose the elementary conditions and the additional condition hold, and let $B_T(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \nabla^2 l_t(\theta)$ and $B(\theta) = -E\bigl[\nabla^2 l_{ut}(\theta)\bigr]$. Then

$$\sup_{\theta \in \Theta_{12}} |B_T(\theta) - B(\theta)| \stackrel{P}{\longrightarrow} 0,$$

and $B(\theta)$ is a continuous function on $\Theta_{12}$.

The following result presents one of the classical results in asymptotic analysis and will be the basic tool for our further considerations. The details regarding the proof can be found in [9, p. 185].

Theorem 4.3 Let $(X_T)$ be a sequence of random $(m \times n)$ matrices and let $(Y_T)$ be a sequence of random $(n \times 1)$ vectors such that $X_T \stackrel{P}{\longrightarrow} C$ and $Y_T \stackrel{d}{\longrightarrow} Y \sim N(\mu, \Omega)$ as $T \to \infty$. Then the limiting distribution of $X_T Y_T$ is the same as that of $CY$; that is,

$$X_T Y_T \stackrel{d}{\longrightarrow} N(C\mu,\, C\Omega C') \qquad \text{as } T \to \infty.$$

The following result assures that $B_0$ is a regular matrix.

Lemma 4.4 Suppose that the joint distribution of $(\varepsilon_t, \varepsilon_t^2, \sigma_{ut}^2)$ is nondegenerate. Then for every $\theta \in \Theta$ the matrix

$$E\left[ \frac{1}{\sigma_{ut}^4} \frac{\partial \sigma_{ut}^2}{\partial \theta} \frac{\partial \sigma_{ut}^2}{\partial \theta'} \right]$$

is positive definite.
Finally, we have all the necessary results for studying the asymptotic behavior of the parameter estimator. In fact, using the results presented above, the following theorem can be proved.

Theorem 4.5 Suppose the elementary conditions and the additional condition hold. Then

$$\sqrt{T}\bigl(\hat{\theta}_T - \theta_0\bigr) \stackrel{d}{\longrightarrow} N(0, V_0),$$

where $V_0 = B_0^{-1} A_0 B_0^{-1}$, $B_0 = B(\theta_0) = -E\bigl(\nabla^2 l_{ut}(\theta_0)\bigr)$ and $A_0$ is defined in Lemma 4.1.

Notice that $A_0 = \bigl(EZ_t^4 - 1\bigr) B_0$. So, in the case in which $(Z_t)$ is a sequence of random variables such that $Z_t \sim N(0,1)$, we would have $EZ_t^4 - 1 = 2$ and $A_0 = 2B_0$.

Let $\hat{B}_T = B_T(\hat{\theta}_T)$. In the case of the maximum likelihood estimator, $\hat{B}_T^{-1}$ would be the standard estimator of the covariance matrix. But in the more general case of the quasi-maximum likelihood estimator, the asymptotic covariance matrix is $B_0^{-1} A_0 B_0^{-1}$ according to Theorem 4.5. Since this is not equal to $B_0^{-1}$, $\hat{B}_T^{-1}$ would not be a consistent estimator of that value. Let us define

$$A_T(\theta) = \frac{1}{T} \sum_{t=1}^{T} \nabla l_t(\theta) \nabla l_t(\theta)', \qquad \hat{A}_T = A_T(\hat{\theta}_T), \qquad A(\theta) = E\bigl[\nabla l_{ut}(\theta) \nabla l_{ut}(\theta)'\bigr].$$

The following result presents the consistency of the covariance matrix estimator.

Lemma 4.6 Suppose the elementary conditions and the additional condition hold. Then

(i) $\sup_{\theta \in \Theta_{12}} |A_T(\theta) - A(\theta)| \stackrel{P}{\longrightarrow} 0$ and $A(\theta)$ is continuous on $\Theta_{12}$;

(ii) $\hat{V}_T = \hat{B}_T^{-1} \hat{A}_T \hat{B}_T^{-1} \stackrel{P}{\longrightarrow} B_0^{-1} A_0 B_0^{-1}$.

Lemma 4.6 completes our characterization of the classical properties of the QMLE for the GARCH(1,1) model. We have shown that the covariance matrix estimator is consistent for the asymptotic variance of the parameter estimator.

References

[1] Amemiya, T. (1985): Advanced Econometrics. Cambridge: Harvard University Press.

[2] Anderson, T.W. (1971): The Statistical Analysis of Time Series. New York: Wiley.

[3] Basrak, B., Davis, R.A., and Mikosch, T. (2002): Regular variation of GARCH processes. Stochastic Process. Appl., 99, 95-115.

[4] Basrak, B., Davis, R.A., and Mikosch, T. (2002): A characterization of multivariate regular variation. Ann. Appl. Probab., 12, 908-920.
[5] Berkes, I., Horváth, L., and Kokoszka, P. (2003): GARCH processes: structure and estimation. Bernoulli, 9, 201-227.

[6] Billingsley, P. (1968): Convergence of Probability Measures. New York: Wiley.

[7] Billingsley, P. (1979): Probability and Measure. New York: Wiley.

[8] Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997): Modelling Extremal Events. Berlin: Springer-Verlag.

[9] Hamilton, J.D. (1994): Time Series Analysis. Princeton: Princeton University Press.

[10] Lee, S.W. and Hansen, B.E. (1994): Asymptotic theory for the GARCH(1,1) quasi-maximum likelihood estimator. Econometric Theory, 10, 29-52.

[11] Mikosch, T. and Stărică, C. (1999): Change of structure in financial time series, long range dependence and the GARCH model. IWI Preprint, 5.

[12] Mikosch, T. and Straumann, D. (2003): Stable limits of martingale transforms with application to the estimation of GARCH parameters. Working Paper, no. 189.

[13] Nelson, D.B. (1990): Stationarity and persistence in the GARCH(1,1) model. Econometric Theory, 6, 318-334.

[14] Straumann, D. (2003): Estimation in Conditionally Heteroscedastic Time Series Models. Ph.D. Thesis, University of Copenhagen.

[15] Williams, D. (1991): Probability with Martingales. Cambridge: Cambridge University Press.