Metodološkizvezki,Vol. 14,No. 1,2017,19–35
TheDistributionoftheRatioofNormalRandom
VariablesandtheEllipticityoftheEarth
GregoryKordas
1
and GeorgePetrakos
2
Abstract
Inthispaperweconsiderinferenceregardingtheratiooftwonormalandtheratio
oftwot-distributedrandomvariables,usingboththepopularFiellerinterval,aswell
as, exact distributions. We apply these methods to a historical dataset regarding the
shapeoftheEarth,andestimatetheEarth’sﬂatnesscoefﬁcientasaratioofregression
coefﬁcients. Wedemonstratetheequivalenceoftheinferenceusingtheexactdensity
ofthisratiowiththatusingtheFiellerinterval.
1 Introduction
This paper grew out of our attempt to use a historical dataset to motivate least squares
estimation and inference in our graduate econometrics course. The dataset in question
concerns the shape of the Earth and consists of ﬁve observations of the length of 1
◦
of
latitude ﬁrst analyzed by Roger Boscovich in the 1750’s. The dataset and its story are
discussed in length in Stephen Stigler’s classic book The History of Statistics (Stigler,
1986: 39-50).
There are several reasons that make the analysis of this particular dataset interesting.
First, from a physical perspective it provided an early support to Newton’s theory of
gravitation. Second,fromastatisticalstandpoint,itillustratestheearlyhistoryofvarious
attemptstocombinenoisyobservationsinordertoobtainlessnoisyestimates. Third,the
parameters in question are known today, so the student can judge for him- or her-self the
effectiveness of the “scientiﬁc method”: careful collection of observations coupled with
soundstatisticalanalysiscanrevealthe“truth”.
The most challenging part of the analysis is to compute the distribution and obtain
conﬁdence intervals for the Earth’s ﬂatness coefﬁcient f that is deﬁned as the ratio of
the two coefﬁcients (slope/intercept) of a simple regression model. To obtain conﬁdence
intervals forf we use Fieller’s theorem, and to compute the entire distribution off we
usethetheratioofnormalsdistributionderivedbyMarsaglia(1965),aswellas,theratio
oft-distributedrandomvariablesderivedbyPress(1969). Weshowthatthetwomethods,
i.e. Fiellerintervalsandexactdistributioncomputation,yieldidenticalresults,thatis,we
show that the probability under the ratio density function between the Fieller bounds is
indeedequaltothespeciﬁedconﬁdencelevel. Giventhatthetwoliteratures,thatofFieller
1
DepartmentofPublicAdministration,PanteionUniversity,Athens,Greece;gkordas@panteion.gr
2
DepartmentofPublicAdministration,PanteionUniversity,Athens,Greece;petrakos@panteion.gr
20 G.KordasandG.Petrakos
intervalsandthatofexactdistributionsofnormalratios,havegrownalmostindependently
ofeachother,itis,tothebestofourknowledge,theﬁrsttimethatthetwoapproachesare
usedsimultaneouslyandareshowntoyieldidenticalresults.
Section 2 of this paper presents brieﬂy the historical background of the question of
the shape of the Earth, and discusses the dataset used by Boscovich. Section 3, presents
the OLS estimates along with Fieller intervals for the Earth’s ﬂatness coefﬁcient under
alternative assumptions. Section 4, takes up the question of computing the exact distri-
bution off and establishes the equivalence of the inference based on exact distributions
with that based on Fieller intervals. The sourceR code used in the empirical application
isprovidedinanAppendix.
2 TheEllipticityoftheEarth
One of the implications of Isaac Newton’s Theory of Universal Gravitation, introduced
in his epoch-making book Philosophia Naturalis Principia Mathematica, often referred
to as simply the Principia, was that, due to gravity, the rotation of the Earth around its
axiswouldcausetheEarthtobulgeattheequatorandﬂattenatthepoles. Moreprecisely,
Newton proved that a rotating self-gravitating ﬂuid body in equilibrium takes the form
of an oblate ellipsoid of revolution (a spheroid). The exact amount of ﬂattening depends
on the body’s density and the balance between the resulting gravitational and centrifugal
forces. If gravity is therefore operational, it is unlikely that the Earth is a perfect sphere,
butitshouldbean oblatespheroid,muchlikeanorange.
Newton’s theory was not the only theory around purporting to explain the motion
of the celestial spheres. The French mathematician and physicist René Descartes had
proposed the competing Theory of Vortices. According to Descartes’ theory, space is
ﬁlled with an invisible substance called the ether, that, much like water, creates vortices
that sweep the planets into their apparent orbits. Now, plastic spherical objects inside
a water vortex tend to ﬂatten at the equator and bulge at the poles, so if the theory of
Vortices was correct, the Earth should be a prolate spheroid, i.e., more like an egg or a
lemoninsteadofanorange.
The two competing theories led to a prolonged controversy, much along nationalistic
lines, between the English and their Continental rivals. Voltaire, who happened to be
visiting London when Newton died in 1727, was greatly impressed by the State funeral
and the honors bestowed on the great scientist, and, on the subject of the controversy,
commentedthat:
A Frenchman arriving in London ﬁnds things very different. [...] For us it
is the pressure of the Moon that causes the tides of the sea; for the English it
is the sea that gravitates towards the Moon. [...] In Paris you see the Earth
shapedlikeamelon,inLondonitisﬂattenedoutontwosides.
Theequationofa3Dellipsoidcenteredattheoriginwithsemi-axesa,bandcaligned
alongthecoordinateaxesis
x
2
a
2
+
y
2
b
2
+
z
2
c
2
= 1.
TheDistributionoftheRatioofNormalRandom... 21
     Q u i t o E q u a t o r L a p l a n d C a p e o f G o o d H o p e 
R o m e P a r i s Figure1: LocationofavailablemeasurementsontheWorldMap
Becauseplanetsrevolvearoundtheirnorth-southaxis,physicalconsiderationsdictatethat
planets are solids of revolution, i.e., solids that can be obtained by revolving a 2D curve
around thez (North-South) axis to obtain a 3D body. This means that for planetsa = b,
sotheequationbecomes
x
2
+y
2
a
2
+
z
2
c
2
= 1.
Ifc<a the ellipsoid is called oblate (orange-like), while ifc>a the ellipsoid is prolate
(lemon-like). Of course, ifc = a we get a perfect sphere. Our interest is in the quantity
ofpolar ﬂatteningor ellipticityf givenby
f =
a−c
a
= 1−
c
a
.
Lettinga = R
E
be the Earth’s equatorial radius andc = R
P
be its polar radius, we can
write
f = 1−
R
P
R
E
.
TheEarthisoblateiff > 0,prolateiff < 0,andsphericaliff = 0.
To settle the matter once and for all, in 1735 the Académie des Sciences Française
send expeditions to Ecuador, Lapland, and South Africa to measure meridians at widely
separated latitudes. Along with the pre-existing measurements from Paris and Rome, the
AcadémiemanagedtocollecttheﬁvedatapointsgiveninTable1(Stigler,1986).
Thelengthof1
◦
oflatitude`atthevariouslocationswasmeasuredintoise,apopular
measure of the time. To get a better idea, the table also presents these lengths in kilome-
ters. A simple inspection of the table makes it apparent that the length of 1
◦
of latitude
22 G.KordasandG.Petrakos
Table1: Dataonthelengthof1
◦
oflatitudeatvariouslocations
Location Latitudeθ sin
2
(θ)
Lengthof 1
◦
`[toises]
a
Lengthof 1
◦
`[km]
Quito,Ecuador 0
◦
0
0
0.0000 56,751 110.551
COGH
b
,S.Africa 33
◦
18
0
0.2987 57,037 111.108
Rome,Italy 42
◦
59
0
0.4648 56,979 110.995
Paris,France 49
◦
23
0
0.5762 57,074 111.180
Lapland,Finland 66
◦
19
0
0.8386 57,422 111.858
a
1toise=1948meters
b
CapeofGoodHope
grows as we move from the equator (θ = 0
◦
) to the poles (θ = 90
◦
). At Quito, which is
on the equator, the length of a degree was measured to be 110.551 km, while at Lapland,
whichistheclosestpeopleofthetimecouldgettothenorthpoleonaccountofthecold,
the length of the degree was measured to be more than a kilometer longer. Clearly these
data favor Newton’s prediction that the Earth ﬂattens at the poles. There are, however,
discrepancies too: at the Cape of Good Hope the length of a degree is longer than that at
Rome, despite the fact that Rome has a larger (north) latitude than the (south) latitude of
the Cape of Good Hope. Graphing these measurements we see that, with the exception
of the (Cape of Good Hope - Rome) pair, there seems to be a consistent tendency for the
lengthtogrowaswemoveawayfromtheequator.
Forshortarcs,theapproximation(seeStigler,1986)
` =β
0
+β
1
(3sin
2
θ)+higher-orderterms,
where` is the length of 1
◦
of latitude andθ is the angle of the latitude, was known to be
satisfactory. The parametersβ
0
andβ
1
can be interpreted as the length of 1
◦
of latitude
at the equator and the excess in length of 1
◦
at the poles over its value at the equator,
respectively. Ellipticityisthereforegivenbyf =β
1
/β
0
.
Theseparametersare, ofcourse, “known”today. Table2(fromAbramowitzandSte-
gun, 1972, p. 8.) presents the geodetic constants for the International Hayford Spheroid
(IHS). We see that the Earth ﬂattens at the poles with an average oblate ellipticity of
f = 1/297. The length of 1
◦
of latitude at the equator is 60×1,842.925 = 110,576 m,
whilethatatthepolesis60×1,861.666 = 111,700m. Thismeansthatthecircumference
of the Earth around the equator is approximately
2
C
E
= 39,807 km, while that around
the poles is approximatelyC
P
= 39,807× (1− 1/297) = 39,673 km, a mere 134 km
lessthanthataroundtheequator.
InBookIIIofthe Principia,Newtonhimselfpredictedthat:
[...] the diameter of the Earth at the equator is to its diameter from pole to
poleas230to229. – Principia,BookIII,PropositionXIX,ProblemIII.
2
Thatthecircumferenceoftheearthinkm’sisalmostaroundnumber(40,000km)isnotacoincidence.
The meter was originally deﬁned as the one ten-millionth (1/10,000,000) of the distance between a pole
and the equator along a great circle over water. Since to go around the earth one has to travel 4 times this
distance,thecircumferenceoftheearthis40millionmeters.
TheDistributionoftheRatioofNormalRandom... 23
3 sin^2(Latitude)
Length of 1 Degree of Latitude (km)
0.0 0.5 1.0 1.5 2.0 2.5 3.0
110.5 111.0 111.5 112.0
Quito
Cape of Good Hope
Rome
Paris
Lapland
68.6 68.8 69.0 69.2 69.4 69.6
Length of 1 Degree of Latitude (miles)
Figure2: Graphof`against3sin
2
θ,alongwiththeleastsquaresline
Thatis, Newton gavef = 1/230. Fitzpatrick(2009: Section2.12), presents atheoretical
model of rotational ﬂattening that, under simplifying homogeneity assumptions about
the rotating body, predicts f = 1/233. He says that this is (essentially) the model that
Newtonusedtomakehisprediction,andcommentsthat“thediscrepancy[withtheactual
f = 1/297value]isduetothefactthattheEarthisstronglyinhomogeneous,beingmuch
denseratitscorethaninitsouterregions”.
Intermsofourmodel,β
0
= 110.576km, whilefromf weobtainapolarexceedance
ofβ
1
= 110.576/297 = 0.3723 km per 1
◦
of latitude. In what follows, we will estimate
β
0
,β
1
,andf fromthedatainTable1andseehowclosethescientistsofthetimecameto
discoveringthe“truth”.
3 EstimatingtheEllipticityoftheEarth
UsingthedatainTable1weestimatebyleastsquarestheregressionmodel
` =β
0
+β
1
(3sin
2
θ)+u.
AccordingtotheEarthparametersinTable2,the“true”equationis
` = 110.576+0.3723 (3sin
2
θ)
24 G.KordasandG.Petrakos
Table2: GeodeticConstants–HayfordInternationalSpheroid
a
a = 6,378,388m;c = 6,356,912m;f = 1/297
Latitude[
◦
]
Lengthof 1
0
of
longitude[m]
Lengthof 1
0
of
latitude[m]
Accelerationof
gravity[m/s
2
]
0 1855.398 1842.925 9.780350
15 1792.580 1844.170 9.783800
30 1608.174 1847.580 9.793238
45 1314.175 1852.256 9.806154
60 930.047 1856.951 9.819099
75 481.725 1860.401 9.828593
90 0.000 1861.666 9.832072
a
FromAbramowitzandStegun(1972),p. 8
sothat,
f = 0.3723/110.576 = 1/297 ;
C
E
= 110.576×360 = 39,807km.
TheOLSestimates(alongwiththeirs.e.’sinparenthesesbelowthem)are
ˆ
` = 110.525 + 0.4697 (3sin
2
θ), R
2
= 0.8773
(0.158) (0.1014) ˆ σ
u
= 0.1903
sothat,
ˆ
f = 0.4697/110.525 = 1/235.3 ;
ˆ
C
E
= 110.525×360 = 39,789km.
Thevariance-covariancematrixfor
ˆ
β isgivenby
V(
ˆ
β) = ˆ σ
2
u
(X
0
X)
−1
=

Var(
ˆ
β
0
) Cov(
ˆ
β
0
,
ˆ
β
1
)
Cov(
ˆ
β
1
,
ˆ
β
0
) Var(
ˆ
β
1
)

=

0.02481 −0.01344
−0.01344 0.01028

.
Giventhesmallnessofthesamplesize,theonlywaytoperforminferenceistoassume
that the neoclassical (normal and spherical errors) model applies. Assuming normal and
spherical errors, we see that
ˆ
β
0
is very statistically signiﬁcant with a t-statistic of 702,
while
ˆ
β
1
islesssigniﬁcant,butstillsigniﬁcantatthe5%level,withat-statisticof4.63(p-
value of 0.019 on 3 d.f.). Compared to the now known quantities, we see thatβ
0
andβ
1
,
and thereforeC
E
andf also, were estimated quite accurately by these data, and that the
data clearly support Newton’s prediction that the Earth bulges at the equator and ﬂattens
at the poles. Figure 3 presents individual 95% CI’s forβ
0
andβ
1
, as well as, a joint 95%
conﬁdence region for the two parameters. The joint conﬁdence region is given by the set
ofvaluesforwhichthequadraticform
1
2

110.525−β
0
0.4697−β
1

0

0.02481 −0.01344
−0.01344 0.01028

−1

110.525−β
0
0.4697−β
1

TheDistributionoftheRatioofNormalRandom... 25
109.8 110.0 110.2 110.4 110.6 110.8 111.0 111.2
0.0 0.2 0.4 0.6 0.8
β
0
β
1
Figure3: The95%jointconﬁdenceregionforβ
0
andβ
1
alongwiththemarginal95%CI’s.
Themarginal95%CIforβ
0
is[110.023,111.026],whilethatforβ
1
is[0.1470,0.7924]. The
nowknowntrueparametervaluesofβ
0
= 110.576andβ
1
= 0.3723arealsoshownaspoint
×.
islessthanorequaltothecriticalF
1−α,2,n−k
=F
.95,2,3
= 9.552value.
In order to obtain a ﬁrst approximate conﬁdence interval forψ = 1/f we resort to a
trick: wedividebothsidesby
ˆ
β
0
andestimatetheOLSregression
`
ˆ
β
0
= 1+
1
ψ
(3sin
2
θ)+
u
ˆ
β
0
.
ThisregressionhasthesameR
2
astheoriginalregressionandthet-statisticfor1/ψ isthe
sameasthet-statisticforβ
1
. Weget 1/
ˆ
ψ =.0042498withans.e. of.0009174,so,using
the t
.975,3
= 3.182 critical value, we obtain the 95% CI for 1/ψ as [.001330,.007169].
Uponinvertingweget
ˆ
ψ = 235.3asabove,andthe 95%CIforψ isgivenby
CI
naive
(ψ;.95) = [139.48,751.77].
TheCIisquitewide,butitimportantlycontainsonlypositivevaluesforψthatcorrespond
toaprolateEarth.
ThisCIforψ treatsβ
0
asknownandequaltotheestimatedvaluewithoutuncertainty,
i.e., it does not take into account the variability in
ˆ
β
0
. Since, in our application,β
0
is es-
timatedveryaccurately(i.e.,it’ss.e. isverysmallrelativetoitsmagnitude)thisomission
26 G.KordasandG.Petrakos
shouldnotmatteralot. Inanycase,thecorrect 95%CIshouldbewider
3
.
ToobtainthecorrectCIweuseFieller’s(1944)theorem. ThefollowingAsidepresents
themethodasitisadaptedtotheGeneralLinearModelbyZerbe(1978).
Aside(Fieller’sTheorem)(Zerbe,1978)Let
ψ =Kβ/Lβ,
whereK andLare1×k vectorsofknownconstants,betheratiooftwolinearcombina-
tions of ak×1 parameter vectorβ. If an estimator
ˆ
β is distributed as
ˆ
β ∼ N(β,Σ), we
havethat,fora
√
n-estimator
ˆ
Σof Σ,
T =
K
ˆ
β−ψL
ˆ
β
h
K
ˆ
ΣK
0
−2ψK
ˆ
ΣL
0
+ψ
2
L
ˆ
ΣL
0
i
1/2
∼t
n−k
.
Lettingt =t
1−α/2,n−k
bethecriticalvaluefromthet
n−k
distribution,wehave
1−α = Pr{−t≤T ≤t} = Pr{T
2
−t
2
≤ 0} = Pr{aψ
2
+bψ +c≤ 0},
where,
a = (L
ˆ
β)
2
−t
2
L
ˆ
ΣL
0
,
b = 2
h
t
2
K
ˆ
ΣL
0
−(K
ˆ
β)(L
ˆ
β)
i
,
c = (K
ˆ
β)
2
−t
2
K
ˆ
ΣK
0
.
The last expression says that the interval containing the required 1−α probability is
characterizedbythevaluesforwhichthebinomialaψ
2
+bψ+cisnegative. Ifaispositive,
thefunctionisconvexandtakesnegativevalues. If,furthermore,thediscriminantb
2
−4ac
is alsopositive, the binomial hastwo distinct realroots that deﬁne therequired CI. Thus,
the 100(1−α)%CIforψ isgivenby

−b−
√
b
2
−4ac
2a
,
−b+
√
b
2
−4ac
2a

,
provided thata > 0 andb
2
−4ac > 0. For the pathologicala < 0 and/orb
2
−4ac < 0
cases, as well as, for a nice geometrical interpretation of Fieller’s theorem, see Luxburg
andFranz(2009).
In our application, ψ = β
0
/β
1
, and
ˆ
β is the OLS coefﬁcient that, under the normal
and spherical errors assumption, is distributed asN(β,σ
2
u
(X
0
X)
−1
). LettingK = (1,0)
andL = (0,1), we can writeψ = Kβ/Lβ as required. Then, usingt = t
.975,3
= 3.182,
wecompute
a = 0.1165> 0, b =−104.1, c = 12,215.4
3
Note that we can obtain a correct CI for ψ by evaluating the Sum of Square Residuals on a grid of
valuesforβ
0
toobtaintheproﬁlelikelihoodforψ. Then,uponinvertingtheLR-teststatisticwecanobtain
anasymptoticallycorrectCI.Toavoidconfusingthediscussion,wedonotpursuethisalternativehere.
TheDistributionoftheRatioofNormalRandom... 27
0 200 400 600 800
−10000 0 5000 10000
ψ
Fieller Binomial
Figure4: TheFiellerbinomialaψ
2
+bψ +c(thinline)that,whennegative,deﬁnesthe
Fieller95%CIforψ (boldline). Thepointestimate
ˆ
ψ = 235.3isalsoshown.
and
b
2
−4ac = 5,144.7> 0,
andtheFieller(correct) 95%CIforψ isgivenby
CI
Fieller,t
(ψ,.95) = [138.95,754.66]. (3.1)
As expected, since β
0
is estimated quite accurately here, this correct CI for ψ (that
takesintoaccountthevariabilityinboth
ˆ
β
0
and
ˆ
β
1
)isonlymarginallywiderthanthenaive
CI we computed above. Note that both CI’s are asymmetric around the point estimate
ˆ
ψ = 235.3 (see Figure 4), with the longer tail towards largeψ’s that correspond to a less
oblateandmoresphericalEarth.
For comparison purposes, we can also treat the means, variances and correlation as
known, so that the ratiof = β
0
/β
1
is a ratio of normals standardized by their true vari-
ances and correlation (see Section 4). This simply amounts to usingt =z
α/2
= 1.96, so
that
a = 0.1811> 0, b =−103.9, c = 12,215.6
and
b
2
−4ac = 1,951.4> 0,
whichyieldsthemuchshorterinterval
CI
Fieller,normal
(ψ,.95) = [164.96,408.84]. (3.2)
Yet a fourth, albeit only asymptotically valid, method of obtaining a CI forψ is the
δ-method,whichproducess.e.(
ˆ
ψ) = 51.08,sothe 95%CIisgivenby
CI
δ−method
(ψ,.95) = 235.3±t
.975,3
51.08 = [72.75,397.86].
28 G.KordasandG.Petrakos
This CI is symmetric around the point estimate, but it disagrees with the correct Fieller-t
CIenormouslybecauseitdisregardstheextremesmallnessofthesamplesize.
Comparedtothenowknownquantities,theestimatesobtainedfromthedatainTable
1 were indeed quite accurate. When published in the mid 1700’s, these data lent con-
siderable support to Newton’s Theory of Gravitation. One could indeed compare this
veriﬁcation of Newton’s predictions to the experimental veriﬁcations of Einstein’s Gen-
eralTheoryofRelativityinthe20thcentury,thatwasfoundtocorrectlyaccountformany
anomaliesthatwereleftunexplainedbyNewtonianmechanics,themostfamousofwhich
aretheadvanceofMercury’sperihelionandthebendingoflightrayspassingthroughthe
gravitationalﬁeldoftheSun.
4 RatiosofNormalandtRandomVariables
In the previous section we obtained several CI’s forψ under alternative assumptions. In
this section we wish to compute the entire distribution of
ˆ
ψ under the same assumptions
andverifythatthetwoapproachesyieldidenticalresults. Westartwiththecaseinwhich
the parameters (means, variances and correlation) of the joint normal distribution are
known, and then discuss how the analysis should be modiﬁed if these parameters are
treatedonlyasestimatesfromasmallsample,asitisthecasehere.
The distribution of the ratio of normal random variables has been derived by several
authors, including Marsaglia (1965), Hinkley (1969), and Cedilnik, Ko˘ smelj, and Blejec
(2004). In our developments below we use the form given by Marsaglia (1965) because
hisresultwasextendedbyPress(1969)toratiosoft-distributedrandomvariablesthatwe
willalsoneed.
Marsaglia(1965)consideredthedistributionoftheratio
R =
Z +a
W +b
, a≥ 0,b≥ 0, (4.1)
wherea,b are nonnegative real constants andZ andW are independent standard normal
randomvariables. ThefollowinglemmagivesthedistributionofR.
Lemma1 (Marsaglia, 1965) IfZ andW are independent normal random variables, the
probability density function of the ratioR in (4.1) is given by
f
R
(t) =
e
−(a
2
+b
2
)/2
π(1+t
2
)

1+
q[2Φ(q)−1]
2φ(q)

, −∞<t<∞, (4.2)
where,
q =
at+b
√
1+t
2
,
andφ(·)andΦ(·)arethestandardnormaldensityanddistributionfunctions,respectively.
Press(1969)consideredagaintheratioin(4.1),onlythistimeheassumedthatZ and
W areindependentt
ν
randomvariables.
TheDistributionoftheRatioofNormalRandom... 29
Lemma2 (Press,1969)IfZ andW areindependentt
ν
randomvariables,theprobability
density function of the ratioR in (4.1) is given by
g
R,ν
(t) =
k
1
1+t
2

1+
k
2
q
q
∗ν+1

2F
ν+1

q
√
ν +1
q
∗

−1

, −∞<t<∞,
where F
n
is the c.d.f of the Student t-distribution with n degrees of freedom, q is as in
Lemma 1,q
∗
=
p
a
2
+b
2
+ν−q
2
and
k
1
=
1
π

1+
a
2
+b
2
ν

−ν/2
, k
2
=
√
πν
ν+2
Γ

ν +1
2

2Γ

ν +2
2

1+
a
2
+b
2
ν

−ν/2
.
Press (1969) shows that asν →∞,g
R,ν
(t) converges tof
R
(t). Both results in Lem-
mas 1 and 2 provide the distribution for uncorrelated standardized variables, with only
their means a and b being nonzero. To use these distributions in applications we need
a way of transforming arbitrary jointly normal or jointly t-distributed random variables
to the form considered in (4.1). Marsaglia (1965) states that there are constants c
1
and
c
2
for which the ratio of arbitrary jointly normal variables X and Y, can be written as
X/Y =c
1
+c
2
(Z +a)/(W +b),i.e.,thatthedistributionofX/Y isatranslationofthat
ofR. AfterwehadderivedthetransformationourselveswefoundthepaperbyMarsaglia
(2006) that gives exactly the same formulas. In the hope that our derivation of the for-
mulas is more transparent than that of Marsaglia, we include here a brief account of our
derivationoftheconstantsc
1
,c
2
,aandb.
It is a well-known fact that if U
1
and U
2
are independent N(0,1) random variables
andρ∈R suchthat|ρ|< 1,then
Z
1
=U
1
and Z
2
=ρU
1
+
p
1−ρ
2
U
2
,
areBVN(0,0,1,1,ρ). Also, forμ
1
,μ
2
∈ R andσ
1
> 0,σ
2
> 0,X
1
= μ
1
+σ
1
Z
1
and
X
2
=μ
2
+σ
2
Z
2
areBVN(μ
1
,μ
2
,σ
1
,σ
2
,ρ). Now letZ andW be independentN(0,1)
randomvariablesandconsidertherotations
X =σ
x
ρ(W +b)+σ
x
p
1−ρ
2
(Z +a) and Y =σ
y
(W +b),
wherea andb are nonnegative constants and|ρ| < 1. Then,X andY are jointly normal
withvariancesσ
2
x
andσ
2
y
andcorrelationρ. Theirratiois
X
Y
=
σ
x
ρ
σ
y
+
σ
x
p
1−ρ
2
σ
y

Z +a
W +b

,
asrequired,with
c
1
=
σ
x
σ
y
ρ and c
2
=
σ
x
σ
y
p
1−ρ
2
. (4.3)
The means ofX andY areμ
x
= bσ
x
ρ+aσ
x
p
1−ρ
2
andμ
y
= σ
y
b, respectively, from
whichweobtain,
a =
μ
x
/σ
x
−ρμ
y
/σ
y
p
1−ρ
2
and b =
μ
y
σ
y
. (4.4)
30 G.KordasandG.Petrakos
SincethedistributionsofZ and−Z andW and−W arethesame,ifaandbhavethe
samesign,c
1
andc
2
canbetakenasgivenin(4.3). If,however,onlyoneoftheconstants
aorbispositive,wetakec
2
=−σ
x
p
1−ρ
2
/σ
y
.
Wehavethefollowinglemma.
Lemma3 LetX andY bejointlynormalrandomvariableswithmeansμ
x
,μ
y
,variances
σ
x
,σ
y
and correlationρ, and letR
0
=X/Y be their ratio.
1. ThedistributionofR
0
isunaffectedifμ
x
,μ
y
,σ
x
andσ
y
areallrescaledbyacommon
positive factor.
2. If f
R
(t) is the density of R in (4.1) where a and b are as given in (4.4) then the
density ofR
0
is given by
f
R
0(t) =
1
c
2
f
R

t−c
1
c
2

, −∞<t<∞,
wherec
1
andc
2
are given in (4.3).
Part1. ofthelemmasaysthat,asfarasratiosareconcerned,oftheﬁveparametersina
bivariate normal distribution, only four are free: the ratiosR
1
=X
1
/Y
1
from (X
1
,Y
1
)∼
BVN(μ
x
,μ
y
,σ
x
,σ
y
,ρ)andR
2
=X
2
/Y
2
from(X
2
,Y
2
)∼BVN(λμ
x
,λμ
y
,λσ
x
,λσ
y
,ρ),
whereλ> 0,havethesamedistribution. ThisanswersHinkley’s(1969)objectionthatin
thegeneralproblemthereareﬁveparametersbutMarsaglia(1965)usesonlyfour,namely
(a,b,c
1
,c
2
).
Part 2. of the lemma gives the relation between the two sets of parameters and yields
some interesting insights into the problem. The location and scale constants c
1
and c
2
dependonlyontheparametersofthecovariancematrixofX andY,andalthoughneces-
sary to pin down the exact distribution ofR for each set of parameters, they are not very
interesting in terms of the various shapes that this distribution assumes. As Marsaglia
(1965) asserted, the shape is indeed determined by the constantsa andb, which we will
call “the numerator and denominator standardized means”, respectively: b is indeed the
standardizedmeanμ
y
/σ
y
ofthedenominatorvariableY,whileaisthestandardizedmean
of the numerator variableX, albeit adjusted for correlation ifρ6= 0. For example, since,
forμ
y
6= 0,a = 0ifandonlyif (μ
x
/σ
x
)/(μ
y
/σ
y
) =ρ,andsincec
1
andc
2
donotdepend
onμ
x
andμ
y
,theratioR
1
from(X
1
,Y
1
)∼BVN[μ
x
=m,μ
y
6= 0,σ
x
,σ
y
,ρ = 0]hasthe
same distribution as the ratioR
2
from (X
2
,Y
2
)∼ BVN[μ
x
= m/σ
x
−ρμ
y
σ
x
/σ
y
,μ
y
6=
0,σ
x
,σ
y
,ρ]. So,althoughthestatement“bissmall,large,orzero”isequivalentto“μ
y
/σ
y
is small, large, or zero, respectively”, the statement “a is small, large or zero” should be
interpreted as: “there is an equivalent model (shifted byc
1
and scaled byc
2
) withρ = 0
forwhicha =μ
x
/σ
x
issmall,largeorzero,respectively”.
RecallthatU followsaCauchyC(ξ,ν)distributionif
g
U
(u) =
(
πν
"
1+

u−ξ
ν

2
#)
−1
and G
U
(u) =
1
2
+
1
π
tan
−1

u−ξ
ν

, (4.5)
for−∞ < u < ∞. The following corollary characterizes the Cauchy densities that are
includedinf
R
(t).
TheDistributionoftheRatioofNormalRandom... 31
Corollary1 Taken together, Lemmas 1 and 3 yield that:
1. TheratioRoftwocenteredjointlynormalrandomvariables(X,Y)∼BVN(μ
x
=
0,μ
y
= 0,σ
x
,σ
y
,ρ) is distributed asC(ξ =c
1
,ν =c
2
).
2. The distribution of the ratio R of two jointly normal random variables for which
eitherμ
x
6= 0 and/orμ
y
6= 0, does not belong to the CauchyC(ν,ξ) family.
The familiar result that the ratio of two independent standard normal variables is dis-
tributed as standard CauchyC(0,1) is a special case of (i). Note also that by (i),C(0,1)
is also the distribution of the ratio of any independent centered jointly normal random
variables with equal variances. Furthermore, the general variances and correlation with
zero means, completely exhaust the Cauchy location-scale family: the negative result in
part(ii)meansthatthedensitiesgraphedbyMarsagliaandotherauthorsfora6= 0and/or
b 6= 0, although Cauchy-like in some cases, do not belong to theC(ξ,ν) location-scale
family.
In the form given in (4.2), the density of R is numerically unstable for large a and
b. To see this, note that whena and/orb is largee
−(a
2
+b
2
)/2
becomes very small, while
1/φ(q) in the expression inside the brackets becomes very large, leading to 0× inﬁnity
computations in terms of ﬂoating-point computer accuracy. Also, as given, the normal
approximation in eq. (4.2) is not apparent. To make the density numerically stable and
revealthenatureofthenormalapproximationlet
h =
bt−a
√
1+t
2
,
anduse
2πφ(h)φ(q) =e
−(a
2
+b
2
)/2
,
torewritethedensityin(4.2)as
f
R
(t) =
q
1+t
2
φ(h){2Φ(q)−1}+
e
−(a
2
+b
2
)/2
π(1+t
2
)
(4.6)
=φ(h)

q{2Φ(q)−1}+2φ(q)
1+t
2

. (4.7)
Either of the above representations are completely general and have the added advantage
that they can be easily evaluated without numerical problems for anya andb no matter
howlargeorsmalltheyhappentobe.
Toseewhathappensasbgetslarge,orasa andbgetsmall,let
f
∗
R
(t) =
q
1+t
2
φ(h) =
dh
dt
φ(h) =
d
dt
Φ(h),
andwrite
f
R
(t) =f
∗
R
(t)×δ(t),
where
δ(t) ={2Φ(q)−1}+
2φ(q)
q
.
32 G.KordasandG.Petrakos
0 200 400 600 800 1000
0.000 0.002 0.004 0.006 0.008
t
Density
Figure5: Thedistributionof
ˆ
ψ underbivariatenormalityof
ˆ
β
0
and
ˆ
β
1
(solidline)andunder
bivariatet-distributionwith3degreesoffreedom(dottedline). Alsoshownarethe
corresponding95%CI’sineachofthetwocases,thatis,[164.96,408.84]undernormality,
and[138.95,754.66]undert-distribution.
Asb becomes large, the ﬁrst term ofδ(t) goes to one and the second term goes to zero
so, for large b, f
R
(t) is approximated by the normal density f
∗
R
(t) in (16), and f
R
0(t)
by the shifted and rescaled according to Lemma 3 normal densityf
∗
R
0(t). Note that this
approximationisvalidirrespectiveofthevalueofa. Ontheotherhand,asbothaandbgo
to zero,f
R
(t) goes to the standard Cauchy density andf
R
0(t) goes to theC(ξ,ν) density
in(4.5)withξ =c
1
andν =c
2
. Thisisobviousfromeither(4.2)or(4.6)andLemma3.
To use these results in our application, we compute the four parameters in (4.3) and
(4.4)andobtain:
a = 1305.7, b = 4.632, c
1
=−1.307, and c
2
= 0.8395,
from which we obtain the distributions in Figure 5. Note that for these large values of
a andb, the density in (4.2) cannot be evaluated due to the above-mentioned numerical
problems, but the alternative forms in (4.6) and (4.7) can be evaluated without any prob-
lems. The solid line in Figure 5 corresponds to the Marsaglia distribution and the solid
verticallinesdeﬁnetheFieller-normal95%CIin(3.2). UsingnumericalintegrationinR,
we compute the area under the curve between the bounds in (3.2) and ﬁnd that the area
is indeed 95%: Using the marsaglia() function inR given in the Appendix and the
TheDistributionoftheRatioofNormalRandom... 33
OLSestimatesfromSection3tospecify par = (μ
x
,μ
y
,σ
x
,σ
y
,ρ),weobtain
> par <- c(110.525,0.4697,0.15751,0.10140,-0.84139)
> mars <- function(x){marsaglia(x,par)}
> integrate(mars,164.96,408.84)
0.9499881 with absolute error < 6e-07
Similarly, the dotted line in Figure 5 corresponds to the Press distribution and the
dottedverticallinesdeﬁnetheFieller-t95%CIin(3.1). Usingagainnumericalintegration
inR,wecomputetheareaunderthedottedcurvebetweentheboundsin(3.1)andﬁndthat
the area is also 95% as expected: Using the press() function given in the Appendix,
thesame parvectorand nu = 3d.f. wecompute
> pres <- function(x){press(x,par,3)}
> integrate(pres,138.95,754.66)
0.9499949 with absolute error < 1.2e-07
WeconcludethatinferencebasedonFiellerintervalsandthatbasedonexactdistribution
computationsyieldidenticalresults.
Appendix: SourceR Code for Fieller Intervals and Ratio
Densities
The followingR functions compute the densities discussed in the main text. The func-
tionmarsaglia() computes the ratio of normals density in (4.10), while the function
press() computes the ratio of t’s density in Lemma 2. Their inputs are z, a vector
of values over which the density is to be evaluated, and par a 5× 1 vector containing
(μ
x
,μ
y
,σ
x
,σ
y
,ρ), in that order. Functionpress() also takes the single-value parame-
ter nu, the degrees of freedom of thet variables. The parametersa,b,c
1
,c
2
in (4.3) and
(4.4)arecomputedinternallyfrom par.
marsaglia <- function(z,par){
mx <-par[1];my<-par[2];sx<-par[3];sy<-par[4];r<-par[5]
vx <-sx^2;vy<-sy^2
a <- mx/(sx
*
sqrt(1-r^2))-(r
*
my)/(sy
*
sqrt(1-r^2)); b<-my/sy
c1 <- (sx
*
r)/sy; c2 <-(sx
*
sqrt(1-r^2))/sy
zn <- (z-c1)/c2
q <- (b+a
*
zn)/sqrt(1+zn^2)
h <- (b
*
zn-a)/sqrt(1+zn^2)
f1 <- b/sqrt(1+zn^2)
*
dnorm(h)
f2 <- q
*
(2
*
pnorm(q)-1)/(b
*
sqrt(1+zn^2))
f3 <- exp(-.5
*
(a^2+b^2))/(pi
*
(1+zn^2))
prob <- (1/c2)
*
(f1
*
f2+f3)
return(prob)
}
press <- function(z,par,nu){
mx <-par[1];my<-par[2];sx<-par[3];sy<-par[4];r<-par[5]
34 G.KordasandG.Petrakos
vx <-sx^2;vy<-sy^2
a <- mx/(sx
*
sqrt(1-r^2))-(r
*
my)/(sy
*
sqrt(1-r^2)); b<-my/sy
c1 <- (sx
*
r)/sy; c2 <-(sx
*
sqrt(1-r^2))/sy
zn <- (z-c1)/c2
k1 <- 1/(pi
*
(1+(a^2+b^2)/nu)^(nu/2))
k2 <- (sqrt(pi)
*
nu^((nu+2)/2)
*
gamma((nu+1)/2))/
(2
*
gamma((nu+2)/2)
*
(1+(a^2+b^2)/nu)^(-nu/2))
q1 <- -(a
*
zn+b)/sqrt(1+zn^2)
q2 <- sqrt(a^2+b^2+nu-q1^2)
prob <- (1/c2)
*
((k1/(1+zn^2))
*
(1+(k2
*
q1/q2^(nu+1))
*
(2
*
pt(q1
*
sqrt(nu+1)/q2,nu+1) -1)))
return(prob)
}
The followingR code computes the OLS estimates and implements Zerbe’s (1978)
algorithmtocomputethe 95%Fieller-tintervalforψ,forthedatainTable1.
X <- cbind(c(1,1,1,1,1), 3
*
c(0,.2987,.4648,.5762,.8386))
y <- c(110.551,111.108,110.995,111.18,111.858)
bhat <- solve(t(X)%
*
%X)%
*
%t(X)%
*
%y
print(bhat)
[,1]
[1,] 110.5245067
[2,] 0.4697037
yhat <- X%
*
%bhat; uhat<-y-yhat; shat<-sqrt(sum(uhat^2)/3)
Vhat <- shat^2
*
(solve(t(X)%
*
%X))
print(Vhat)
[,1] [,2]
[1,] 0.02480802 -0.01343743
[2,] -0.01343743 0.01028128
psi <- bhat[1]/bhat[2]
print(psi)
[1] 235.3069
K <- matrix(c(1,0),1,2); L <- matrix(c(0,1),1,2)
a1 <- (L%
*
%bhat)^2 - qt(.975,3)^2
*
(L%
*
%Vhat%
*
%t(L))
a2 <- 2
*
(qt(.975,3)^2
*
(K%
*
%Vhat%
*
%t(L))-(K%
*
%bhat)
*
(L%
*
%bhat))
a3 <- (K%
*
%bhat)^2-qt(.975,3)^2
*
(K%
*
%Vhat%
*
%t(K))
low <- (-a2-sqrt(a2^2-4
*
a1
*
a3))/(2
*
a1)
up <- (-a2+sqrt(a2^2-4
*
a1
*
a3))/(2
*
a1)
print(c(low,up))
[1] 138.9486 754.6641
TheDistributionoftheRatioofNormalRandom... 35
Acknowledgment
The authors are grateful to the editors and the two anonymous referees for their valuable
commentsthathavesigniﬁcantlyimprovedthequalityandpresentationofthispaper.
References
[1] AbramowitzM., and Stegun, I.A. (1972): Handbook of Mathematical Tables, New
York: Dover.
[2] Cedilnik, A., Košmelj, K., and Blejec, A. (2004): The distribution of the ratio of
jointlynormalvariables. Metodolo˘ ski zvezki,1,99–108.
[3] Fieller, E.C. (1944): A fundamental formula in the statistics of biology assay and
someapplications.QuartelyJournalofPharmacyandPharmacology,17,117–123.
[4] Fitzpatrick R. (2009): Theoretical Fluid Mechanics. Austin: University of Texas at
AustinPress.
[5] Hinkley, D.V. (1969): On the ratio of two correlated normal random variables.
Biometrika,56,635–639.
[6] Luxburg, von U., and Franz, V.H. (2009): A geometric approach to conﬁdence sets
for ratios: Fieller’s theorem, generalizations and bootstrap. Statistica Sinica, 19,
1095–1117.
[7] Marsaglia, G.M. (1965): Ratios of normal variables and ratios of sums of uniform
variables. Journal of the American Statistical Association,60,193–204.
[8] Marsaglia,G.M.(2006): Ratiosofnormalvariables.JournalofStatisticalSoftware,
16,1–10.
[9] Press,S.J.(1969): Thetratiodistribution.JournaloftheAmericanStatisticalAsso-
ciation,64,242–252.
[10] Stigler, S.M. (1986): The History of Statistics, Cambridge: Harvard University
Press.
[11] Zerbe, G.O. (1978): On Fieller’s theorem and the general linear model. The Ameri-
can Statistician,32,103–105.