Metodološki zvezki, Vol. 1, No. 1, 2004, 99-108
The Distribution of the Ratio of Jointly Normal
Variables
Anton Cedilnik1, Katarina Košmelj2, and Andrej Blejec3
Abstract
We derive the probability density of the ratio of components of the bivariate normal distribution with arbitrary parameters. The density is a product of two factors: the first is a Cauchy density, the second a very complicated function. We show that the distribution under study does not possess an expected value or other moments of higher order. Our particular interest is focused on the shape of the density. We introduce a shape parameter and show that according to its sign the densities are classified into three main groups. As an example, we derive the distribution of the ratio $Z = -B_{m-1}/(mB_m)$ for a polynomial regression of order $m$. For $m = 1$, $Z$ is the estimator for the zero of a linear regression, for $m = 2$, an estimator for the abscissa of the extreme of a quadratic regression, and for $m = 3$, an estimator for the abscissa of the inflection point of a cubic regression.
1    Introduction
The ratio of two normally distributed random variables occurs frequently in statistical analysis. For example, in linear regression, $E(Y \mid x) = \beta_0 + \beta_1 x$, the value $x_0$ for which the expected response $E(Y)$ has a given value $y_0$ is often of interest. The estimator for $x_0$, the random variable $X_0 = (y_0 - B_0)/B_1$, is under the standard regression assumption expressed as the ratio of two normally distributed and dependent random variables, built from $B_0$ and $B_1$, which are the estimators for $\beta_0$ and $\beta_1$ and whose distributions and dependence are known from regression theory.
1 Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, 1000 Ljubljana, Slovenia; Anton.Cedilnik@bf.uni-lj.si
2 Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, 1000 Ljubljana, Slovenia; Katarina.Kosmelj@bf.uni-lj.si
3 National Institute of Biology, University of Ljubljana, Vecna pot 111, 1000 Ljubljana, Slovenia; Andrej.Blejec@uni-lj.si
Similar to the example above is the situation of a quadratic regression, $E(Y \mid x) = \beta_0 + \beta_1 x + \beta_2 x^2$, where the value sought is the $x_0$ for which $E(Y)$ reaches its extreme value. At this point, the first derivative must be zero. Hence, $X_0 = -B_1/(2B_2)$ is expressed as the ratio of two normally distributed and dependent variables as well.
From the literature it is known that the distribution of the ratio $Z = X/Y$, when $X$ and $Y$ are independent centred normal variables, is Cauchy. The probability density function for a Cauchy variable $U \sim C(a, b)$ is
$$p_U(x) = \frac{b}{\pi\left((x-a)^2 + b^2\right)},$$
where the location parameter $a$ is the median, while the quartiles are obtained from the location parameter $a$ and the positive scale parameter $b$: $q_{1,3} = a \mp b$. This density function $p_U(x)$ has 'fat tails', hence $U$ does not possess an expected value or moments of higher order (Johnson et al., 1994).
Some results about the ratio from the literature are:
(a) The ratio Z of two centred normal variables is a Cauchy variable (Jamnik, 1971: 149):
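$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim N(\mu_X = \mu_Y = 0,\ \sigma_X,\ \sigma_Y,\ \rho \ne \pm 1) \;\Longrightarrow\; Z = \frac{X}{Y} \sim C\!\left(a = \rho\,\frac{\sigma_X}{\sigma_Y},\ b = \frac{\sigma_X}{\sigma_Y}\sqrt{1-\rho^2}\right).$$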
The simplest case is the ratio of two independent standardised normal variables which is a ‘standard’ Cauchy variable C(0,1).
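Result (a) lends itself to a quick simulation check. The following is a minimal sketch, not taken from the cited sources: the helper names `cauchy_params`, `sample_ratio` and `empirical_quartiles`, the parameter values and the seed are all illustrative assumptions. It draws centred, correlated normal pairs, forms the ratio, and compares the empirical quartiles with $a \mp b$ and the empirical median with $a$.

```python
import math
import random

def cauchy_params(sigma_x, sigma_y, rho):
    """Location a and scale b of Z = X/Y for centred jointly normal X, Y."""
    a = rho * sigma_x / sigma_y
    b = (sigma_x / sigma_y) * math.sqrt(1.0 - rho ** 2)
    return a, b

def sample_ratio(sigma_x, sigma_y, rho, n, seed=0):
    """Draw n ratios X/Y from a centred bivariate normal (hypothetical helper)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y = sigma_y * g1
        # x has standard deviation sigma_x and correlation rho with y
        x = sigma_x * (rho * g1 + math.sqrt(1.0 - rho ** 2) * g2)
        if y != 0.0:  # guard against an (almost impossible) exact zero
            ratios.append(x / y)
    return ratios

def empirical_quartiles(values):
    """Crude order-statistic estimates of the quartiles and the median."""
    s = sorted(values)
    n = len(s)
    return s[n // 4], s[n // 2], s[(3 * n) // 4]

a, b = cauchy_params(2.0, 1.0, 0.5)
q1, med, q3 = empirical_quartiles(sample_ratio(2.0, 1.0, 0.5, 100_000))
print(f"theory: q1={a - b:.3f} median={a:.3f} q3={a + b:.3f}")
print(f"sample: q1={q1:.3f} median={med:.3f} q3={q3:.3f}")
```

Quartiles are compared rather than means because the Cauchy distribution has no expected value, so a sample mean would not settle down.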
(b) The ratio Z of two non-centred independent normal variables is a particular Cauchy-like distribution. This result is shown in Kamerud (1978).
(c)  The ratio of two arbitrary normal variables is discussed in Marsaglia (1965) and leads again to a Cauchy-like distribution.
The case considered in (b) is not general and the result in the cited article is presented in a very implicit way. Marsaglia dealt with the ratio of two independent normal variables, having shown previously, however, that any case could be transformed into this setting.
The objective of our work is to derive the probability density for the ratio of components of the bivariate normal distribution for a general setting. Let the vector $W = [X\ Y]^T \sim N(\mu_X, \mu_Y, \sigma_X > 0, \sigma_Y > 0, \rho)$ be distributed normally, with the density (for $\rho \ne \pm 1$):
$$p_W(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_X)^2}{\sigma_X^2} - \frac{2\rho(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y} + \frac{(y-\mu_Y)^2}{\sigma_Y^2}\right]\right\}$$
and with the expected value and the variance-covariance matrix
$$E(W) = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}, \qquad \operatorname{var}(W) = \begin{bmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{bmatrix}.$$
Our aim is to express the density function of the ratio $Z = X/Y$ explicitly, in terms of the parameters of the bivariate normal distribution. We shall also discuss the degenerate situation, $\rho = \pm 1$.
2    Probability density for the ratio
The following theorem is the basis for our derivation of the probability density for the ratio (Jamnik, 1971: 148).
Theorem 1. Let $W = [X\ Y]^T$ be a continuously distributed random vector with a probability density function $p_W(x, y)$. Then $Z = X/Y$ is a continuously distributed random variable with the probability density function
$$p_Z(z) = \int_{-\infty}^{\infty} |y|\, p_W(zy, y)\, dy = \int_{0}^{\infty} y\, p_W(zy, y)\, dy - \int_{-\infty}^{0} y\, p_W(zy, y)\, dy. \tag{2.1}$$
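The integral (2.1) can also be evaluated numerically, which is useful as an independent check of any closed form. The sketch below is an illustration under stated assumptions: the function names `bvn_pdf` and `ratio_pdf_numeric`, the truncation bound `y_max` and the midpoint rule are choices made here, not part of the theorem. The caller is assumed to pick `y_max` large enough that the integrand is negligible beyond it.

```python
import math

def bvn_pdf(x, y, mu_x, mu_y, s_x, s_y, rho):
    """Bivariate normal density for |rho| < 1."""
    dx = (x - mu_x) / s_x
    dy = (y - mu_y) / s_y
    quad = (dx * dx - 2.0 * rho * dx * dy + dy * dy) / (1.0 - rho ** 2)
    norm = 2.0 * math.pi * s_x * s_y * math.sqrt(1.0 - rho ** 2)
    return math.exp(-0.5 * quad) / norm

def ratio_pdf_numeric(z, params, y_max=12.0, n=4000):
    """Midpoint-rule approximation of (2.1) on [-y_max, y_max].

    params = (mu_x, mu_y, s_x, s_y, rho). With n even, y = 0 (where |y|
    has a kink) is a cell boundary, which keeps the rule accurate.
    """
    h = 2.0 * y_max / n
    total = 0.0
    for i in range(n):
        y = -y_max + (i + 0.5) * h
        total += abs(y) * bvn_pdf(z * y, y, *params)
    return total * h

# Independent standardised normals: the result should match C(0, 1).
standard = (0.0, 0.0, 1.0, 1.0, 0.0)
for z in (0.0, 1.0, -2.5):
    cauchy = 1.0 / (math.pi * (1.0 + z * z))
    print(z, ratio_pdf_numeric(z, standard), cauchy)
```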
For the derivation of pZ(z) for the ratio of the components of a bivariate normal vector we calculated the integral (2.1) using formulae in the Appendix. A long but straightforward calculation gives the next theorem.
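Theorem 2. The probability density for $Z = X/Y$, where $[X\ Y]^T \sim N(\mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho \ne \pm 1)$, is expressed as a product of two terms:
$$\begin{aligned}
p_Z(z) &= \frac{\sigma_X\sigma_Y\sqrt{1-\rho^2}}{\pi\left(\sigma_Y^2 z^2 - 2\rho\sigma_X\sigma_Y z + \sigma_X^2\right)} \cdot \exp\left(-\tfrac{1}{2}\sup R^2\right)\left[1 + \frac{R\,\Phi(R)}{\varphi(R)}\right] \\
&= \frac{\sigma_X\sigma_Y\sqrt{1-\rho^2}}{\pi\left(\sigma_Y^2 z^2 - 2\rho\sigma_X\sigma_Y z + \sigma_X^2\right)} \cdot \left[\exp\left(-\tfrac{1}{2}\sup R^2\right) + \sqrt{2\pi}\, R\, \Phi(R)\, \exp\left(-\tfrac{1}{2}\left[\sup R^2 - R^2\right]\right)\right],
\end{aligned} \tag{2.2}$$
where:
$$R = R(z) = \frac{\left(\sigma_Y^2\mu_X - \rho\sigma_X\sigma_Y\mu_Y\right) z - \rho\sigma_X\sigma_Y\mu_X + \sigma_X^2\mu_Y}{\sigma_X\sigma_Y\sqrt{1-\rho^2}\cdot\sqrt{\sigma_Y^2 z^2 - 2\rho\sigma_X\sigma_Y z + \sigma_X^2}} = \frac{\left(\dfrac{\mu_X}{\sigma_X} - \rho\dfrac{\mu_Y}{\sigma_Y}\right) z - \left(\rho\dfrac{\mu_X}{\sigma_X} - \dfrac{\mu_Y}{\sigma_Y}\right)\dfrac{\sigma_X}{\sigma_Y}}{\sqrt{1-\rho^2}\cdot\sqrt{z^2 - 2\rho\dfrac{\sigma_X}{\sigma_Y} z + \left(\dfrac{\sigma_X}{\sigma_Y}\right)^2}}, \tag{2.2a}$$
$$\sup R^2 = \frac{\sigma_Y^2\mu_X^2 - 2\rho\sigma_X\sigma_Y\mu_X\mu_Y + \sigma_X^2\mu_Y^2}{\sigma_X^2\sigma_Y^2(1-\rho^2)} = \frac{\left(\dfrac{\mu_X}{\sigma_X}\right)^2 - 2\rho\,\dfrac{\mu_X}{\sigma_X}\dfrac{\mu_Y}{\sigma_Y} + \left(\dfrac{\mu_Y}{\sigma_Y}\right)^2}{1-\rho^2}, \tag{2.2b}$$
$$\sup R^2 - R^2 = \frac{(\mu_X - \mu_Y z)^2}{\sigma_Y^2 z^2 - 2\rho\sigma_X\sigma_Y z + \sigma_X^2} = \frac{\left(\dfrac{\mu_X}{\sigma_X}\dfrac{\sigma_X}{\sigma_Y} - \dfrac{\mu_Y}{\sigma_Y} z\right)^2}{z^2 - 2\rho\dfrac{\sigma_X}{\sigma_Y} z + \left(\dfrac{\sigma_X}{\sigma_Y}\right)^2}. \tag{2.2c}$$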
The first factor in (2.2), the standard part, is the density for a non-centred Cauchy variable, $C\!\left(\rho\,\dfrac{\sigma_X}{\sigma_Y},\ \dfrac{\sigma_X}{\sigma_Y}\sqrt{1-\rho^2}\right)$. We have to stress that this factor is independent of the expected values $\mu_X$ and $\mu_Y$.
The second factor, the deviant part, is a complicated function of $z$, including also the error function $\Phi(\cdot)$ (in Gauss form; see Appendix). We need four parameters: $\rho$, $\dfrac{\mu_X}{\sigma_X}$, $\dfrac{\mu_Y}{\sigma_Y}$, and $\dfrac{\sigma_X}{\sigma_Y}$, to fully describe the distribution. The deviant part is strictly positive and asymptotically constant: it has the same positive value for both $z = \pm\infty$, due to the fact that
$$R(\pm\infty) = \pm\frac{\sigma_Y\mu_X - \rho\sigma_X\mu_Y}{\sigma_X\sigma_Y\sqrt{1-\rho^2}}$$
and that the deviant part depends on $R$ only through even functions of $R$. Therefore, the asymptotic behaviour of $p_Z(z)$ is the same as that of the Cauchy density, so $E(Z)$ and other moments do not exist.
We wrote the deviant part in (2.2) in two forms. The first form is nicer and can also be found in Marsaglia (1965), but the second form is better for numerical purposes.
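The second form of (2.2) translates directly into code. The sketch below is one possible transcription, assuming the Gauss-form error function $\Phi(r) = \tfrac{1}{2}\operatorname{erf}(r/\sqrt{2})$ from the Appendix; the names `phi0`, `ratio_pdf` and `total_mass` are hypothetical. As a sanity check it integrates the density over the whole line using the substitution $z = a + b\tan\theta$, which maps the heavy Cauchy tails onto a finite interval.

```python
import math

def phi0(r):
    """Gauss-form error function: integral of the standard normal density from 0 to r."""
    return 0.5 * math.erf(r / math.sqrt(2.0))

def ratio_pdf(z, mu_x, mu_y, s_x, s_y, rho):
    """Density (2.2) of Z = X/Y, second form, for |rho| < 1."""
    q = s_y ** 2 * z ** 2 - 2.0 * rho * s_x * s_y * z + s_x ** 2
    k = s_x * s_y * math.sqrt(1.0 - rho ** 2)
    standard = k / (math.pi * q)  # Cauchy factor
    num = ((s_y ** 2 * mu_x - rho * s_x * s_y * mu_y) * z
           - rho * s_x * s_y * mu_x + s_x ** 2 * mu_y)
    r = num / (k * math.sqrt(q))  # R(z), eq. (2.2a)
    sup_r2 = (s_y ** 2 * mu_x ** 2 - 2.0 * rho * s_x * s_y * mu_x * mu_y
              + s_x ** 2 * mu_y ** 2) / k ** 2  # eq. (2.2b)
    gap = (mu_x - mu_y * z) ** 2 / q  # sup R^2 - R^2, eq. (2.2c)
    deviant = (math.exp(-0.5 * sup_r2)
               + math.sqrt(2.0 * math.pi) * r * phi0(r) * math.exp(-0.5 * gap))
    return standard * deviant

def total_mass(mu_x, mu_y, s_x, s_y, rho, n=2000):
    """Integrate ratio_pdf over the real line via z = a + b*tan(theta)."""
    a = rho * s_x / s_y
    b = (s_x / s_y) * math.sqrt(1.0 - rho ** 2)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * h
        z = a + b * math.tan(theta)
        # dz = b / cos(theta)^2 dtheta
        total += ratio_pdf(z, mu_x, mu_y, s_x, s_y, rho) * b / math.cos(theta) ** 2
    return total * h

print(total_mass(2.0, 1.0, 1.0, 1.0, 0.0))
print(total_mass(-2.0, 0.25, 1.0, 1.0, 0.5))
```

The second form is used because the exponent $\sup R^2 - R^2$ is non-negative, so no exponential can overflow; the first form divides by $\varphi(R)$, which underflows for large $|R|$.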
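A more detailed analysis of $p_Z(z)$ led us to the definition of the shape parameter $\omega$:
$$\omega = \frac{\mu_Y}{\sigma_Y}\left(\frac{\mu_X}{\sigma_X} - \rho\,\frac{\mu_Y}{\sigma_Y}\right), \tag{2.3}$$
based on $R(\pm\infty)$ and
$$\frac{dR}{dz} = \frac{\sigma_X\sigma_Y\sqrt{1-\rho^2}\,(\mu_X - \mu_Y z)}{\left(\sigma_Y^2 z^2 - 2\rho\sigma_X\sigma_Y z + \sigma_X^2\right)^{3/2}}.$$
The sign of $\omega$ separates three different types of shape of $p_Z(z)$:
I.        $\omega > 0$
II.       $\omega < 0$
III.      $\omega = 0$, which occurs in three variants:
a.    $\mu_Y \ne 0$,
b.   $\mu_Y = 0 \ne \mu_X$,
c.    $\mu_Y = 0 = \mu_X$.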
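The derivative of the deviant part led us to the definitions of two quantities for types I and II:
$$u = \frac{\mu_X}{\mu_Y} = \frac{\mu_X/\sigma_X}{\mu_Y/\sigma_Y}\cdot\frac{\sigma_X}{\sigma_Y} \qquad\text{and}\qquad d = \frac{\rho\,\dfrac{\mu_X}{\sigma_X} - \dfrac{\mu_Y}{\sigma_Y}}{\dfrac{\mu_X}{\sigma_X} - \rho\,\dfrac{\mu_Y}{\sigma_Y}}\cdot\frac{\sigma_X}{\sigma_Y}.$$
Here $u$ is the abscissa of the local maximum and $d$ the abscissa of the local minimum of the deviant part. For type I: $d < a < u$, and for type II: $u < a < d$; as previously, $a = \rho\,\dfrac{\sigma_X}{\sigma_Y}$, the centre of the standard part (see Figure 1).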
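The classification by the sign of the shape parameter, together with the abscissas $u$ and $d$, can be sketched as a small helper. This is an illustration only: the function name `shape_summary`, the returned dictionary and the tolerance `tol` are assumptions, and the shape parameter is taken as $\omega = \frac{\mu_Y}{\sigma_Y}\left(\frac{\mu_X}{\sigma_X} - \rho\frac{\mu_Y}{\sigma_Y}\right)$, of which only the sign is used. The parameter sets in the usage lines are the ones shown in the panel titles of Figures 1 and 2.

```python
def shape_summary(mu_x, mu_y, s_x, s_y, rho, tol=1e-12):
    """Classify the density of X/Y into types I, II, IIIa, IIIb, IIIc."""
    mx, my, k = mu_x / s_x, mu_y / s_y, s_x / s_y
    omega = my * (mx - rho * my)  # shape parameter; only its sign matters
    a = rho * k                   # centre of the standard (Cauchy) part
    if omega > tol:
        kind = "I"
    elif omega < -tol:
        kind = "II"
    elif abs(my) > tol:
        kind = "IIIa"
    elif abs(mx) > tol:
        kind = "IIIb"
    else:
        kind = "IIIc"
    u = d = None
    if kind in ("I", "II"):
        # omega != 0 guarantees both denominators below are non-zero
        u = mu_x / mu_y                             # local maximum of the deviant part
        d = k * (rho * mx - my) / (mx - rho * my)   # local minimum of the deviant part
    return {"kind": kind, "omega": omega, "a": a, "u": u, "d": d}

for params in [(2, 1, 1, 1, 0), (-2, 0.25, 1, 1, 0.5),
               (1, 1, 4, 2, 0.5), (2, 0, 1, 1, 0.5), (0, 0, 2, 1, 0.5)]:
    print(params, shape_summary(*params))
```

A tolerance is used for the comparison with zero because type III is a boundary case that floating-point rounding could otherwise misclassify.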
[Figure 1 panels: Type I, (X,Y) ~ N(2, 1, 1, 1, 0); Type II, (X,Y) ~ N(-2, 0.25, 1, 1, 0.5).]
Figure 1: A case with a positive shape parameter (Type I) and with a negative shape parameter (Type II). On the left, the standard Cauchy part (thick line) and the deviant part (thin line) are presented; the functions are on different scales in order to depict the shapes of both functions on one plot. The vertical dashed lines indicate the abscissas of the local extremes of the deviant part, the horizontal dashed line is its asymptote. The right plot presents the graph of the density $p_Z(z)$.
[Figure 2 panels: Type IIIa, (X,Y) ~ N(1, 1, 4, 2, 0.5); Type IIIb, (X,Y) ~ N(2, 0, 1, 1, 0.5); Type IIIc, (X,Y) ~ N(0, 0, 2, 1, 0.5).]
Figure 2: Three cases having zero value of the shape parameter (Type III). On the left, the standard Cauchy part (thick line) and the deviant part (thin line) are presented; the functions are on different scales in order to depict the shapes of both functions on one plot. The vertical dashed line indicates the abscissa of the local extreme of the deviant part, the horizontal dashed line is its asymptote. The right plot presents the graph of the density $p_Z(z)$.
Type III describes the marginal case, not likely to occur in practice. In variant IIIa (resp. IIIb), the deviant part has only a maximum (resp. a minimum) at $z = a$. In variant IIIc, the deviant part is equal to the constant 1 (see Figure 2).
The median $M(Z)$ and mode(s) cannot be obtained analytically for the general case; further numerical calculations have to be done for each particular case. But we have derived some partial results. For type I: $M(Z) > \rho\,\dfrac{\sigma_X}{\sigma_Y}$, for type II: $M(Z) < \rho\,\dfrac{\sigma_X}{\sigma_Y}$, for type III: $p_Z(z)$ is symmetric and $M(Z) = \rho\,\dfrac{\sigma_X}{\sigma_Y}$. Variants IIIa and IIIc are unimodal; generally, $p_Z(z)$ may be uni- or bimodal. The distribution function and quantiles require numerical integration.
3    Degenerate situation
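Now, let us consider the case $\rho = \pm 1$, but still with $\sigma_X > 0$, $\sigma_Y > 0$. Then, the distribution of $W = [X\ Y]^T$ is degenerate, and with probability 1, it holds
$$\frac{X - \mu_X}{\sigma_X} = \rho\cdot\frac{Y - \mu_Y}{\sigma_Y}; \qquad\text{hence:}\qquad Z = \frac{X}{Y} = \rho\,\frac{\sigma_X}{\sigma_Y} + \frac{\mu_X - \rho\,\dfrac{\sigma_X}{\sigma_Y}\,\mu_Y}{Y}.$$
Since the marginal distribution $Y \sim N(\mu_Y, \sigma_Y)$ is the usual normal distribution, it is easy to find the probability density for $Z$ from the following theorem.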
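Theorem 3. If $Y \sim N(\mu_Y, \sigma_Y)$ and $Z = a + \dfrac{c}{Y}$, $c \ne 0$, then $Z$ has the density given by
$$p_Z(z) = \frac{|c|}{\sqrt{2\pi}\,\sigma_Y\,(z-a)^2}\,\exp\left\{-\frac{1}{2\sigma_Y^2}\left(\frac{c}{z-a} - \mu_Y\right)^2\right\}.$$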
The function $p_Z(z)$ from this theorem is much simpler than (2.2) and it is rather easy to find its characteristics, including quantiles and distribution function. Also, there are two modes that can be found explicitly, and between them there is a removable singularity $p_Z(a) = 0$. The expected value, as in non-degenerate cases, does not exist.
It is worth noting that in the degenerate case the shape parameter (2.3) is zero precisely when $p_Z(z)$ is symmetric, as in the non-degenerate case. According to the sign of the shape parameter, the relations between the median $M(Z)$ and the quantity $a = \rho\,\dfrac{\sigma_X}{\sigma_Y}$ remain the same, as well.
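The degenerate case can be sketched in a few lines. The code below is an illustration under stated assumptions: the names `degenerate_params`, `degenerate_pdf` and `interval_probability` are hypothetical, and the parameter values are arbitrary. It builds $a$ and $c$ from the bivariate parameters, evaluates the density of Theorem 3, and checks one interval probability against the normal distribution of $Y$, using the fact that $Z \in [a + c/y_2,\ a + c/y_1]$ exactly when $Y \in [y_1, y_2]$ for $c > 0$ and $0 < y_1 < y_2$.

```python
import math

def degenerate_params(mu_x, mu_y, s_x, s_y, rho):
    """Return (a, c) such that Z = a + c/Y when rho is +1 or -1."""
    if rho not in (1, -1):
        raise ValueError("rho must be +1 or -1 in the degenerate case")
    a = rho * s_x / s_y
    c = mu_x - a * mu_y
    return a, c

def degenerate_pdf(z, a, c, mu_y, s_y):
    """Density of Z = a + c/Y for Y ~ N(mu_y, s_y), c != 0 (Theorem 3)."""
    t = z - a
    if t == 0.0:
        return 0.0  # removable singularity at z = a
    y = c / t
    normal = math.exp(-0.5 * ((y - mu_y) / s_y) ** 2) / (math.sqrt(2.0 * math.pi) * s_y)
    return abs(c) / (t * t) * normal

def interval_probability(lo, hi, a, c, mu_y, s_y, n=2000):
    """Midpoint-rule integral of the density over [lo, hi] (a must lie outside)."""
    h = (hi - lo) / n
    return h * sum(degenerate_pdf(lo + (i + 0.5) * h, a, c, mu_y, s_y)
                   for i in range(n))

a, c = degenerate_params(mu_x=5.0, mu_y=1.0, s_x=2.0, s_y=1.0, rho=1)
p_z = interval_probability(a + c / 2.0, a + c / 1.0, a, c, mu_y=1.0, s_y=1.0)
p_y = 0.5 * math.erf(1.0 / math.sqrt(2.0))  # P(1 <= Y <= 2) for Y ~ N(1, 1)
print(a, c, p_z, p_y)
```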
4    Examples
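Now, let us discuss the two problems presented in the Introduction. First, we will consider a linear regression $E(Y \mid x) = \beta_0 + \beta_1 x$. We shall be interested in the x-axis intercept: $X_0 = -B_0/B_1$, where $B_0$ and $B_1$ denote the estimators for $\beta_0$ and $\beta_1$. Under the assumption that $Y \mid x \sim N(\beta_0 + \beta_1 x, \sigma_{reg})$, the variable $X_0$ is expressed as the ratio of two normally distributed and dependent random variables $-B_0$ and $B_1$. Given the data $\{(x_i, y_i),\ i = 1,\dots,n\}$ $(x_1 < x_n)$, we denote:
$$\bar{x} = \frac{1}{n}\sum x_i, \qquad w = \sqrt{\frac{1}{n}\sum x_i^2} \qquad\text{and}\qquad q = \frac{\sigma_{reg}}{\sqrt{n\,(w^2 - \bar{x}^2)}}.$$
Then:
$$\begin{bmatrix} B_0 \\ B_1 \end{bmatrix} \sim N\!\left(\beta_0,\ \beta_1,\ qw,\ q,\ -\frac{\bar{x}}{w}\right), \qquad \begin{bmatrix} -B_0 \\ B_1 \end{bmatrix} \sim N\!\left(-\beta_0,\ \beta_1,\ qw,\ q,\ \frac{\bar{x}}{w}\right).$$
Hence, $X_0$ has a distribution with density function (2.2) on making the substitution: $\mu_X \to -\beta_0$, $\mu_Y \to \beta_1$, $\sigma_X \to q\cdot w$, $\sigma_Y \to q$, $\rho \to \bar{x}/w$.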
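The substitution for the linear case is simple enough to compute directly from the design points. The sketch below is illustrative: the function name `zero_estimator_params` and the sample values are assumptions, and $\beta_0$, $\beta_1$ and $\sigma_{reg}$ are treated as known, as in the text.

```python
import math

def zero_estimator_params(xs, beta0, beta1, sigma_reg):
    """Bivariate normal parameters of [-B0, B1] for the intercept X0 = -B0/B1."""
    n = len(xs)
    x_bar = sum(xs) / n
    w = math.sqrt(sum(x * x for x in xs) / n)
    # n * (w^2 - x_bar^2) equals the sum of squared deviations of the x values
    q = sigma_reg / math.sqrt(n * (w * w - x_bar * x_bar))
    return {"mu_x": -beta0, "mu_y": beta1,
            "s_x": q * w, "s_y": q, "rho": x_bar / w}

params = zero_estimator_params([0.0, 1.0, 2.0, 3.0], beta0=1.0, beta1=2.0, sigma_reg=1.0)
print(params)
```

The returned values are exactly the five arguments that a density routine for (2.2) would need.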
Now we shall be concerned with a general polynomial regression $E(Y \mid x) = \beta_0 + \beta_1 x + \dots + \beta_m x^m$, $m \ge 1$. Let us define $Z = -B_{m-1}/(mB_m)$. For $m = 1$, $Z = X_0$ from the first example, the estimator for the zero of a linear regression.
For $m = 2$, $Z$ is an estimator for the abscissa of the extreme of a quadratic regression, and for $m = 3$, $Z$ is an estimator for the abscissa of the inflection point of a cubic regression.
Introduce the following two data matrices: $v = \left[x_i^k\ (i = 1,\dots,n;\ k = 0,\dots,m)\right]_{n\times(m+1)}$, the matrix of powers of the $x$'s, and $Y = \left[y_i\ (i = 1,\dots,n)\right]_{n\times 1}$. The regularity condition, that there are at least $m + 1$ distinct $x$'s, implies that the rank of $v$ is precisely $m + 1$. Hence, $v^Tv$ is invertible and $d = (v^T v)^{-1} = \left[d_{jk}\ (j,k = 0,\dots,m)\right]_{(m+1)\times(m+1)}$. Let $\beta = \left[\beta_k\ (k = 0,\dots,m)\right]_{(m+1)\times 1}$ be the column of the regression coefficients, and $B = \left[B_k\ (k = 0,\dots,m)\right]_{(m+1)\times 1}$ the column of their estimators. The normal system of equations in matrix form is then $v^T v\, B = v^T Y$, and its solution is
$$B = d\, v^T Y. \tag{4.1}$$
As usual, we shall suppose that the $y$'s are independent normally distributed random variables with $E(y_i) = \beta_0 + \beta_1 x_i + \dots + \beta_m x_i^m$, $\operatorname{var}(y_i) = \sigma_{reg}^2$. Hence, the vector $Y$ is normally distributed with $E(Y) = v\beta$, $\operatorname{var}(Y) = \sigma_{reg}^2 I$. According to (4.1), $B$ is also normally distributed, with
$$E(B) = (d\, v^T)\, E(Y) = \beta, \qquad \operatorname{var}(B) = (d\, v^T)\operatorname{var}(Y)\,(d\, v^T)^T = \sigma_{reg}^2\, d.$$
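Introduce two matrices:
$$u = \begin{bmatrix} 0 & \cdots & 0 & -1 & 0 \\ 0 & \cdots & 0 & 0 & m \end{bmatrix}_{2\times(m+1)} \qquad\text{and}\qquad W = u\, B = \begin{bmatrix} -B_{m-1} \\ mB_m \end{bmatrix}_{2\times 1}.$$
$W$ is also a normal variable with
$$E(W) = u\, E(B) = u\,\beta = \begin{bmatrix} -\beta_{m-1} \\ m\beta_m \end{bmatrix}$$
and
$$\operatorname{var}(W) = u \operatorname{var}(B)\, u^T = \sigma_{reg}^2\, u\, d\, u^T = \sigma_{reg}^2 \begin{bmatrix} d_{m-1,m-1} & -m\, d_{m-1,m} \\ -m\, d_{m-1,m} & m^2 d_{m,m} \end{bmatrix}.$$
Therefore, the distribution of $W$ is
$$N\!\left(-\beta_{m-1},\ m\beta_m,\ \sigma_{reg}\sqrt{d_{m-1,m-1}},\ m\,\sigma_{reg}\sqrt{d_{m,m}},\ -\frac{d_{m-1,m}}{\sqrt{d_{m-1,m-1}\, d_{m,m}}}\right).$$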
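Hence, $Z$ has a distribution with density function (2.2) with the exchange: $\mu_X \to -\beta_{m-1}$, $\mu_Y \to m\beta_m$, $\sigma_X \to \sigma_{reg}\sqrt{d_{m-1,m-1}}$, $\sigma_Y \to m\,\sigma_{reg}\sqrt{d_{m,m}}$, $\rho \to -\dfrac{d_{m-1,m}}{\sqrt{d_{m-1,m-1}\, d_{m,m}}}$.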
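For a general order $m$ the same parameters follow from the matrix $d = (v^Tv)^{-1}$. The sketch below is a minimal pure-Python illustration, not a recommended numerical method: the names `invert` and `ratio_params` are hypothetical, the inverse uses plain Gauss-Jordan elimination, and for high orders or badly scaled $x$ values a dedicated linear-algebra library would be the safer choice.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("v^T v is singular: need at least m + 1 distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [e / p for e in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [e - f * pe for e, pe in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ratio_params(xs, beta, sigma_reg):
    """Parameters of W = [-B_{m-1}, m*B_m]^T for a polynomial of order m = len(beta) - 1."""
    m = len(beta) - 1
    v = [[x ** k for k in range(m + 1)] for x in xs]          # matrix of powers
    vtv = [[sum(row[j] * row[k] for row in v) for k in range(m + 1)]
           for j in range(m + 1)]
    d = invert(vtv)
    return {
        "mu_x": -beta[m - 1],
        "mu_y": m * beta[m],
        "s_x": sigma_reg * math.sqrt(d[m - 1][m - 1]),
        "s_y": m * sigma_reg * math.sqrt(d[m][m]),
        "rho": -d[m - 1][m] / math.sqrt(d[m - 1][m - 1] * d[m][m]),
    }

linear = ratio_params([0.0, 1.0, 2.0, 3.0], beta=[1.0, 2.0], sigma_reg=1.0)
quadratic = ratio_params([-1.0, 0.0, 1.0, 2.0], beta=[1.0, -2.0, 0.5], sigma_reg=1.0)
print(linear)
print(quadratic)
```

For $m = 1$ the output reduces to the linear-regression substitution $\sigma_X = qw$, $\sigma_Y = q$, $\rho = \bar{x}/w$, which gives a convenient consistency check.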
References
[1]   Jamnik, R. (1971): Verjetnostni racun. Mladinska knjiga, Ljubljana.
[2]   Johnson, N.L., Kotz, S., and Balakrishnan, N. (1994): Continuous Univariate Distributions. 1. John Wiley and Sons.
[3]   Kamerud, D. (1978): The random variable X/Y, X, Y normal. The American Mathematical Monthly, 85, 207.
[4]   Marsaglia, G. (1965): Ratios of normal variables and ratios of sums of uniform variables. JASA, 60, 163-204.
Appendix
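$$a > 0 \;\Longrightarrow\; \int_0^m t\,\exp(-at^2 + bt + c)\,dt = \frac{e^c}{2a}\left[1 - \exp(bm - am^2)\right] + \frac{b\,e^c}{2a}\sqrt{\frac{\pi}{a}}\,\exp\!\left(\frac{b^2}{4a}\right)\left[\Phi\!\left(\frac{b}{\sqrt{2a}}\right) + \Phi\!\left(m\sqrt{2a} - \frac{b}{\sqrt{2a}}\right)\right],$$
$$\int_{-\infty}^{\infty} |t|\,\exp(-at^2 + bt + c)\,dt = \left[\int_0^{\infty} + \int_0^{-\infty}\right] t\,\exp(-at^2 + bt + c)\,dt = \frac{e^c}{a}\left[1 + \frac{r\,\Phi(r)}{\varphi(r)}\right],$$
where:
$$r = \frac{b}{\sqrt{2a}}, \qquad \varphi(r) = \frac{1}{\sqrt{2\pi}}\, e^{-r^2/2}, \qquad \Phi(r) = \int_0^r \varphi(x)\,dx = \frac{1}{2}\operatorname{erf}\!\left(\frac{r}{\sqrt{2}}\right).$$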
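Both Appendix formulae can be checked against direct quadrature. The sketch below is an illustration only: the function names, the midpoint rule, the truncation of the infinite range to $[-10, 10]$ and the test values of $a$, $b$, $c$, $m$ are all assumptions made here.

```python
import math

def phi(r):
    """Standard normal density."""
    return math.exp(-0.5 * r * r) / math.sqrt(2.0 * math.pi)

def phi0(r):
    """Gauss-form error function: integral of phi from 0 to r."""
    return 0.5 * math.erf(r / math.sqrt(2.0))

def closed_form_finite(a, b, c, m):
    """Closed form of the integral of t*exp(-a t^2 + b t + c) over [0, m], a > 0."""
    r = b / math.sqrt(2.0 * a)
    first = math.exp(c) / (2.0 * a) * (1.0 - math.exp(b * m - a * m * m))
    second = ((b / (2.0 * a)) * math.sqrt(math.pi / a) * math.exp(b * b / (4.0 * a) + c)
              * (phi0(r) + phi0(m * math.sqrt(2.0 * a) - r)))
    return first + second

def closed_form_abs(a, b, c):
    """Closed form of the integral of |t|*exp(-a t^2 + b t + c) over the real line."""
    r = b / math.sqrt(2.0 * a)
    # phi(r) underflows for very large |r|; fine for moderate arguments
    return math.exp(c) / a * (1.0 + r * phi0(r) / phi(r))

def midpoint(f, lo, hi, n=4000):
    """Composite midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, b, c, m = 1.5, 0.7, -0.2, 2.0
kernel = lambda t: math.exp(-a * t * t + b * t + c)
num_finite = midpoint(lambda t: t * kernel(t), 0.0, m)
num_abs = midpoint(lambda t: abs(t) * kernel(t), -10.0, 10.0)
print(num_finite, closed_form_finite(a, b, c, m))
print(num_abs, closed_form_abs(a, b, c))
```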