Image Anal Stereol 2004;23:185-198 Original Research Paper
SPATIAL EXTRAPOLATION OF ANISOTROPIC ROAD TRAFFIC DATA
Hans Braxmeier1,2, Volker Schmidt2 and Evgueni Spodarev2
1Department of Applied Information Processing, University of Ulm, D-89069 Ulm, Germany, 2Department of
Stochastics, University of Ulm, D-89069 Ulm, Germany
e-mail: hans.braxmeier@mathematik.uni-ulm.de, schmidt@mathematik.uni-ulm.de,
spodarev@mathematik.uni-ulm.de
(Accepted July 14, 2004)
ABSTRACT
A method of spatial extrapolation of traffic data is proposed. The traffic data is given by GPS signals over downtown Berlin sent by approximately 300 taxis. To reconstruct the traffic situation at a given time spatially, i.e., in the form of traffic maps, kriging with moving neighborhood based on residuals is used. Due to significant anisotropy in directed traffic data, the classical kriging has to be modified in order to include additional information. To verify the extrapolation results, test examples on the basis of a well-known model of stochastic geometry, the Boolean random function are considered.
Keywords: anisotropy, asymptotic Gaussian test, Boolean model, kriging, moving neighborhood, random field.
INTRODUCTION
A common difficult problem of large cities with heavy traffic is the forecasting of traffic jams. In this paper, a first step towards mathematical traffic forecasting, namely the spatial reconstruction of the present traffic situation from point measurements is done. To describe the traffic states, models of stochastic geometry and spatial statistics (or geostatistics) are used. A corresponding Java software that implements efficient algorithms of spatial extrapolation is developed.
This research is based on real traffic data originating from downtown Berlin. They were provided by the Institute of Transport Research of the German Aerospace Center (DLR). Approximately 300
test vehicles (taxis) were equipped with GPS sensors transmitting their geographic coordinates, velocity and status line (e.g., “free”, “hired”, “at the taxi rank”, etc.) to a central station within regular time intervals from 30 sec. up to 3 min. The regularity of these signals depends on the taxi’s status. Thus, a large data base of more than 13 million positions was formed since April 2001 (see Fig. 1).
In the present paper, a smaller data set (taxi positions on all working days from 30.09.2001 till 19.02.2002, 5.00–5.30 p.m., moving taxis only) is considered. The observation window was reduced to downtown Berlin in order to avoid inhomogeneities in the taxi positions. To study traffic jams, the rush hour (5.00–5.30 p.m.) was chosen.
Fig. 1. Observed positions of test vehicles in downtown Berlin.
185
Braxmeier H et al: Spatial extrapolation of road traffic data
To produce road traffic maps, the velocities of all vehicles at time t are assumed to be induced by a realization of a spatial random field V(t) = {V(t,u)} where V(t,u) is a traffic velocity vector at position ueR2 and time t > 0. The spatial structure of such random velocity fields makes the analysis of traffic-jam mechanisms possible. Thus, the spatial localization of traffic jams can be obtained by a threshold operation on the grey-scale image of the map of velocities V(t,u): a point u lies within the traffic jam region at time t if \V(t,u)\ is smaller than a given threshold value, e.g., 15kph.
Since V(t,u) can be measured just pointwise at observation points ui, a spatial extrapolation of the observed data is necessary. Notice that the velocities strongly depend on the movement directions, e.g., the speed limits and consequently the mean velocities are higher on motorways than in downtown streets. Furthermore, the formation of traffic jams is also directional since a vehicle can influence only those vehicles moving behind it along the same road in the same direction. Moreover, the traffic speed at position u clearly depends on the traffic direction on the road, e.g., in directions of the city center or suburbs.
The classical extrapolation methods of geostatistics such as the ordinary kriging (see, e.g., Stoyan et al, 1997, Wackernagel, 1998) either make no use of additional information or provide measurements V(t,u + ui) and V(t,u-ui) with equal weights. Both these features are not relevant to the above problem setting. An extrapolation method designed for directional data, the so-called complex cokriging of velocities and their directions (see, e.g., Wackernagel, 1998) cannot be used here as well since there is no one-to-one correspondence between measurement positions u and traffic directions. An obvious counterexample is a crossroads. Thus, the standard extrapolation methods had to be adapted to our specific problem. Therefore, a modified ordinary kriging with moving neighborhood is described that allows to extrapolate directed velocity fields. First, the original data set should be split into N directionally homogeneous subsets. A data unit (u,V(t,u)) belongs to the data set i (i = 1,..., N) if the polar angle of the vector V(t,u) lies within the directional sector
Si = [2n(i-1)/N,2ni/N).
By convention, the zero polar angle corresponds to the eastward direction on the city map. Throughout this paper, we put N = 4. From the practical point of view, this is sufficient for the separation of opposite traffic directions and, simultaneously, keeps the amount of resulting data sets small. Nevertheless, in principle, any other N^4 could be used instead.
The above data sets should be extrapolated separately from each other. This yields N velocity maps corresponding to N directional sectors.
In what follows, the data from a given time interval [t1,t2] will be taken for extrapolation. To be precise, we put t1 = 5.00 p.m. and t2 = 5.30 p.m. Keeping this in mind, we shall omit the time parameter t in further notation. The observed velocities are not spatially homogeneous. Hence, the mean velocity field {m(u)} obtained by averaging the traffic velocities over all working days from 5.00 p.m. till 5.30 p.m. should be considered. As far as this mean field is subtracted from the original data, the deviations of actual velocity values are extrapolated in order to create the spatial field of velocity residuals.
This extrapolation method has been implemented in Java. Thus, a software library was developed comprising the estimation and fitting of variograms as well as the ordinary kriging with moving neighborhood. As far as it is known to the authors, it is the first complete implementation of such kriging methods in Java. An advantage of the Java programming language lies in its platform independence. Great attention was paid to the efficient implementation of fast algorithms. In contrast to classical geostatistics operating with relatively small data sets, this efficiency is of great importance for larger data sets with more than 10000 entries. For instance, the Java package for variogram fitting described in Faulkner, 2002 cannot be used for data sets with more than 1000 entries due to unacceptable runtimes. Efficient image processing and computational algorithms (see, e.g., Mayer et al., 2004) enabled us to drastically reduce the runtimes of the Java library.
The extrapolation method itself as well as the software quality are verified on the test example of a Boolean random function; see Serra, 1988. Remarkable features of this model are its simplicity of simulation and nice analytical description. For test purposes, 90 independent realizations of a Boolean model with a deterministic drift have been simulated. The quality of extrapolation is proved by means of statistical significance tests of the area fraction. It is shown that extrapolated images perfectly retain the essential structure of original test images.
This justifies the application of the above method to traffic data. First, the mean velocity fields are estimated for all directional sectors. Then, the deviations from the mean of actual speed values are extrapolated for particular days and time intervals. On their basis, traffic-jam maps are created; see Figs. 19– 21.
186
Image Anal Stereol 2004;23:185-198
There are several interesting perspectives for further research. In particular, using methods recently developed in Heinrich et al., 2004, Klenk et al., 2004, and Schmidt and Spodarev, 2004, models of stochastic geometry can be statistically fitted to extrapolated traffic maps. In the next step, the fitted models can be used in order to predict future traffic states on the basis of currently incoming traffic data. Such spacetime prediction models as well as their applications to forecasting of traffic states will be discussed in a forthcoming paper.
SOME PRELIMINARIES
RANDOM FIELDS
To model traffic maps, non-stationary random fields composed of a deterministic drift and an intrinsically stationary random deviation field, the so-called residual, are used. See, e.g., monographs Cressie, 1993 and Wackernagel, 1998 for details.
Cressie, 1993, p. 190. The main idea of the method is straightforward. First of all, an estimator m(u) for the drift m(u) has to be constructed. Then, the deviation field Y* = {Y*(u), ueR2} defined by
Y*(u)=X(u)-m(u)
(2)
is formed and its kriging estimator Y*(u) is computed. Finally, the estimator X(u) is given by
X(u) =
m\u)
Y*(u).
(3)
If we suppose that the drift is known, i.e., m(u) = m(u) for all u then we know the exact values Y(u!),...,Y(un) of the deviation field at uh...,un since
Y*(u)=Y(u)=X(u)-m(u).
Let
y(ui)=x(ui)-m(ui),
i
1,...,n
DRIFT AND DEVIATION FIELD
Let X = {X(u),u G R2}  be  a non-stationary random field with finite second moments
E[X(u)2] <
00 ,
GR2.
Then X can be decomposed into a sum
X(u) = m(u)+Y(u)
where m(u) = E[X(u)] is the mean field (drift) and Y(u) = X(u)-m(u) is the deviation field from the mean or residual. Clearly, it holds E[Y(u)] = 0 for all u. Assume that Y is intrinsically stationary of order two. Denote by
7(h)
1 E[(Y(u)-Yu+h)) 2 ]
(1)
its variogram function. In practice, the field X can be observed in a compact (say, rectangular) window W C R2. Let x(u),... ,x(un) be a sample of observed values of X, ut G W for all i. The extrapolation method described in the next section yields an “optimal” estimator X(u) of the value of X(u) for any u G W based on the sample random variables X(ui),... ,X(un). Among the variety of extrapolation techniques for non-stationary random fields (see, e.g., the universal kriging in Cressie, 1993, Kitanidis, 1997, Wackernagel, 1998), our approach is similar to the so-called kriging based on the residuals; see
be a realization of the sample values of Y. The extrapolation of Y(u) can be performed either by simple kriging based on the covariance function
C(h) = E[Y(u)Y(u+h)]
or by ordinary kriging making use of the variogram ?(h); see Cressie, 1993, Kitanidis, 1997, Wackernagel, 1998. In what follows, the second method is used.
ORDINARY KRIGING WITH MOVING NEIGHBORHOOD
THE KRIGING ESTIMATOR
A simpler version of the following ordinary kriging with moving neighborhood can be found in Chile`s and Delfiner, 1999, pp. 201–210, Kitanidis, 1997, p. 71 and Wackernagel, 1998, pp. 101–102. Denote by 1 the usual indicator function
\{x G B}
1   ifx?B, 0   otherwise.
Introduce the estimator Y
linear combination of the sample random variables
Y(u) with unknown weights A,- = A«(u) by
u) of Y(u) at u ? W as a
Y(u) = 1LXiY(ui)l{uieA(u)}
(4)
i=i
_

_
187
Braxmeier H et al: Spatial extrapolation of road traffic data
The estimation involves only the sample random variables Y(ui) such that ui is positioned in the “neighborhood” A(u) of u, i.e., ui eA(u). Being an arbitrary set, this moving neighborhood A(u) contains a priori information about the geometric dependence structure of the random field Y. For instance, it could be designed to model the formation of traffic jams. In the case of a Boolean model, this set A(u) is influenced by the shape of the primary grain. In general, A(u) can be a random closed set, i.e., A(u) =A(Z(u,co),u) where Z = {Z(u),u G R2} is a random field containing extra information about Y. Under such general assumptions on A(u), the system of linear equations on the weights Ai looks much more complicated than (7) considered below. In order to solve it, additional parameters such as crosscovariances of Y and Z should be estimated. Even in the case of uncorrelated fields Y and Z, it makes the extrapolation unnecessary complex. To avoid this, the present paper uses only deterministic sets A(u).
The normalizing condition on the weights Ai
i>
(5)
ensures the unbiasedness of the estimator given in (4). In other words, it holds
E[Y(u)]=E[Y(u)}
even if the mean of Y is not zero. Moreover, this condition makes it possible to use variograms in (7) since variograms are negative conditionally semidefinite (see, e.g., Wackernagel, 1998, pp. 52-53). The “optimality” of the estimator Y(u) means that its variance should be minimal, i.e.,
ESTIMATION OF THE VARIOGRAM
In applications, a variogram estimation method to be used should always be chosen in accordance with the data framework. The most simple and popular one is undoubtedly the estimator of Matheron (see, e.g., Chile`s and Delfiner, 1999, Wackernagel, 1998). Its drawback is sensitivity to outliers. Among robust estimation methods, the trimmed mean estimator (see, e.g., Lehmann and Casella, 1998) as well as the estimators of Cressie–Hawkins (see Cressie, 1993) and Genton (see Genton, 1998a, Genton, 2001) should be mentioned. These methods are designed for noisy data but they are biased.
Since the traffic data seem to be not contaminated with outliers, the estimator of Matheron is used here. It is defined by
7(h)
1
2N(h)
L      (Y(ui)-Y(uj))2
(8)
i,j:ui-uj?h
where ui - uj « h means that ui - uj belongs to a certain neighborhood U(h) of vector h and N(h) denotes the number of such pairs {ui ,uj) for i,j = 1,...,n. The choice of U(h) depends on the problem. In the present paper, the following segment of a circle is used:
U(h) = {x G R2:x=(\x\,(p), | \h\-\x\ \<8,\<p-<po\<e},
(9)
where (|x|, <p) and (\h\, <po) are the polar coordinates of x and h;8,e> 0. If /is continuous then the estimator in (8) is asymptotically unbiased, i.e.,
lim E
?,?›0
7(h).
E[{Y{u)-Y{u)f] —>min.
(6)
This classical minimization problem yields further conditions on Ai which can be written together with (5) in the following system of linear equations. For all i = 1,...,nwith ui G A(u) it holds
L Xjy{uj-ui)l{uj G A(u)} + ß = y(u-ui)
(7)
LAjl{ujeA(u)}
1
In order to solve this system of equations, the knowledge of the variogram function y(h) is required. However, in most practical cases y(h) is unknown and has to be estimated from the data y(ui),..., y(un).
Under further assumptions on Y such as ergodicity, it is also strongly consistent, i.e., it holds
lim   9(h) = 7(h)
N(h)^oo
almost surely.
VARIOGRAM MODELS
In practice, the estimated variogram y cannot be substituted directly for j in the system of linear equations (7). Trying this would make the numerical computation in (7) unstable because of the singularity of its coefficient matrix. Even in the case when this computation is possible its result is not correct. The reason for that is simple: f(h)   is   not  a  valid  variogram  function  since  it
_
1
_
_
_
188
Image Anal Stereol 2004;23:185-198
is not conditionally negative semidefinite. Hence, a valid parametric variogram model y (the so-called theoretical variogram) should be fitted to the empirical estimator y In the following, some valid variogram models are considered. The corresponding fitting procedures are discussed later on. A popular isotropic variogram model is the exponential one (see, e.g., Cressie, 1993, pp. 61-63, Wackernagel, 1998, pp. 244-246):
7(h)
0,                           h = 0,
a + b(1-e-\h\lc),   h^0,
where a ^ 0, b ^ 0 and c> 0 are parameters with the following geometric meaning. The value of the nugget effect a measures the discontinuity of the realizations of Y at the microscopic scale. If a > 0 then the realizations of Y are not continuous. The sill b describes the variability of the data for greater distances \h\. The third parameter c is the range of correlation of Y which implies that the random variables Y(x) and Y(x+h) are almost uncorrelated for \h\ > 3c.
A parametric variogram model y is called geometrically anisotropic if the range value c (and none of the other parameters) depends on the direction of h. If, in addition, the sill value b depends on the direction of h, the variogram is called zonally anisotropic.
As shown in Fig. 17, the traffic data lead to empirical variograms that are clearly zonally anisotropic. Below, we consider zonally anisotropic variogram models constructed from isotropic ones (see Cressie, 1993, Wackernagel, 1998). Introduce
7(h) = yi(h) + y2(h),
(10)
where ?1(h) is an exponential isotropic variogram model with nugget effect a1 > 0, sill b1 and range c1. The second term
yi(h) = b2(1-e-h Ch/c2)
(11)
is a geometrically anisotropic exponential variogram model with sill b2 > 0 and further parameter c2 > 0. Here C is the quadratic matrix of a linear transformation of the observation window, i.e.,
C = QTAQ,
where
Q=     -sinacosa                   (12)
is a rotation by the angle a around the origin and
a=(b 0 ]                   (13)
 0    X2
is a scaling transformation with scaling factors ?1, ?2 along the coordinate axes. For a vector h = (h1,h2), we have
hTCh = X2hi x (cos2 aih\
Aih + (A2-Ai)x h\h2sin(2a))
h21
Level curves of y2(h) are ellipses with main axes of polar angles a and a + n/2. The range values in these directions are equal to

(14)
Fig. 2 shows the level curves of the variogram model (10) with parameter values ax = 130, b\ = 20, c\ = 0.03, b2 = 70, c\/Xi = 109, c\/X2 = 5 • 10-5, a = 5°. Higher values of y are marked red.
Fig. 2. Zonally anisotropic variogram.
VARIOGRAM FITTING
Let y{h) be an empirical variogram estimated from the experimental data {y{ui)} for the field Y and let Yß(h) be a theoretical parametric variogram model with parameter vector
ß = (ßh...,ßk).
In the example mentioned above, we have
ß = {ahbhchb2,h/c,h/cl,a).
In practice, only a finite number m of values
y(h),...,y(hm)

_
189
Braxmeier H et al: Spatial extrapolation of road traffic data
can be computed. For two reasons, it is enough to confine computations to vectors hi of length \hi\ < diam(W)/2. First, in most cases the behavior of the variogram in a small neighborhood of the origin is decisive for the adequate choice of the model. Second, for large distances \h\ > diam(W)/2 the estimated values f(h) are contaminated by noise due to edge effects.
In order to estimate the parameter vector ß, the least-squares method is used. The generalized least-squares method (see Genton, 1998b) minimizes the following function of ß,
m
F iß) =  j   wijiYßihi) -Y{hi)){Yß{hj) -7(hj)),
where the weights can be chosen in accordance with the a priori assumptions on Y (see Cressie, 1993 for Gaussian random fields). If the distribution of Y is unknown, the classical weighting scheme can be applied:
f 1,   i = j,
wij={ 0,   i + j
with the function
F{ß) = YJ{Yß{hi)-Y{hi))2            (15)
to be minimized. In the case of traffic data, no a priori assumptions on the structure of Y have been made. Thus, the classical least-squares methods is used.
For isotropic random fields, one fits the one-dimensional curve of a parametric variogram model Yß(\h\) to an empirical one. In the anisotropic case, f(h) is computed for vectors hona square grid with m points and is fitted by a two-dimensional parametric surface yß(h), he R2. This can be done either by summing in (15) over all grid points h or only over vectors hi in a certain direction of interest <p, i.e.,
hi = \hi\(cosq>,sinq>).
Since traffic data is substantially anisotropic, the variogram model (10) has to be fitted to the data on the whole grid as well as in two directions with polar anglesaanda + 7r/2.
DRIFT ESTIMATION
The mean field {m(u)} can be estimated from the data by various methods ranging from radial
extrapolation (see, e.g., zu Castell et al, 2002 and references therein) to smoothing techniques such as moving average and edge preserving smoothing (see, e.g., Tomasi and Manduchi, 1998). In what follows, the moving average is used because of its ease and computational efficiency for large data sets.
By moving average, the value m(u) is estimated as m{u) = —   L   X(ui)                (16)
where W(u) is the “moving” neighborhood of the point u and Nu denotes the number of measurement points ui G W(u). The choice of the neighborhood W(u) is arbitrary. For fast computation, we put W(u) to be a square with side length t centered in u.
The estimator (16) yields arbitrarily smooth results for large moving neighborhoods W(u). Thus, an optimal side length t should be found to fit the problem. In the traffic problem, t must be small because edges of the surface {m(u),u G W} are intrinsic to the image structure and have to be preserved by smoothing.
In all large cities, there are areas D of parks, forests, building blocks, etc. where no road-traffic data is available. By (16), this implies m(u)=0 for all points u with W(u)cD. Consequently, such points u would automatically belong to traffic-jam regions and so contaminate traffic-jam maps with artefacts. To avoid this, the neighborhood W(u) of points u with Nu = 0 has to be enlarged till it contains at least one observation point. In this way, meaningful average velocity maps are obtained that allow the correct analysis of traffic jams.
Since X is not stationary and, consequently, m(u) is not constant the estimator (16) is biased. Nevertheless, in practical applications, the bias Em(u) - m(u) is small provided that the area \W(u)\ is small and the net of observation points is spatially dense enough.
RESIDUALS FORMED WITH ESTIMATED DRIFT
In previous sections, it has been assumed that the drift m(u) was explicitly known. If it has to be estimated from the data, the theoretical background for the application of the kriging method breaks down.
Indeed, kriging requires intrinsic stationarity of the field of residuals Y?(u) introduced in (2). This
190
16
Image Anal Stereol 2004;23:185-198
requirement is clearly not satisfied even in the case of an unbiased estimator m(u) since the variogram
f (h) = -E [Y*(u)-Y*(u + h)}2
is not equal any more to the variogram y(h) of Y (see Chiles and Delfiner, 1999, pp. 122-125, Cressie, 1993 p. 72, Wackernagel, 1998, p. 214) and depends clearly on u.
Despite these theoretical obstacles, practitioners continue to use the ordinary kriging of residuals with estimated drift based on the data y*(ut) = x(u) -m(ui), i=1,...,n legitimized by its ease and satisfactory results.
ALGORITHMSAND IMPLEMENTATION IN JAVA
In the following, some efficient algorithms for spatial extrapolation are discussed. Their implementation in Java was integrated into the GeoStoch library GeoStoch, 2004 as a separate package. The software is supplied with detailed comments generated by Java-Doc complying with the Sun standards; see Niemeyer and Peck, 1996, pp. 80-81.
FAST ESTIMATION OF VARIOGRAMS AND DRIFTS
Matheron’s estimator (8) requires all pairs of positions u and uj with ui-uj G U(h) to be found for each lattice vector h. For k lattice vectors and n positions, it costs kO in2) operations. By means of the binary search tree structure DTree, this complexity can be significantly reduced.
Such algorithm tessellates the searching space into rectangles and saves positions of actual measurements in a binary tree. Thus, searching k points from p costs in average k + log(p) operations; see Segewick, 1992. Since k < p always holds, the average complexity of the search is O (log (p)). Additionally, the complexity of filling the tree with values is O{plog{p)). For variogram estimation, one stores
p = ^^ polar coordinates of the vectors between any two measurement points in a DTree. Thus, the overall complexity for the variogram computation is O(plog(p)) + kO(log(p)). For large square lattices with side length m > 200 (k = m2), the difference in run times is significant!
The DTree structures can be used also for the fast  computation  of the  moving  average.  There,
measurement points lying in a certain square neighborhood should be found. The complexity of such computation can be estimated as mentioned above.
VARIOGRAM FITTING
In variogram fitting, one employs essentially known algorithms for the minimization of functions. The idea of all stochastic algorithms lies in cleverly modifying parameters of the variogram model at random till the maximal quadratic distance to the empirical curve becomes smaller than a critical value ?. This can be done for instance by means of genetic algorithms (see Goldberg, 1989) or the method of simulated annealing; see, e.g., Press et al., 2002, pp. 448–460. Genetic algorithms were implemented in Java and integrated in the GeoStoch library. The simulated annealing Java package JSimul is available from Me´gnin, 2001. Additionally, one-dimensional variogram fitting by slices was implemented in Java by Faulkner, 2002. This Java package provides good GUI but poor runtime performance for large data samples.
TEST EXAMPLE: BOOLEAN MODEL
To test the performance of the above extrapolation method, one needs to generate synthetic data whose theoretical properties are known. In other words, one has to find a random field {X(u)} with known structure of distribution, variogram and shape of realizations that is easy to simulate. In the following, we construct such a random field on the basis of the so-called Boolean random field, a model that is classical in stochastic geometry.
DEFINITION AND PROPERTIES OF THE SIMULATION MODEL
In what follows, basic properties of the Boolean model in R2 are described. For more details, see, e.g., Stoyan et al., 1995. Let
<P = {X1,X2,X3,...}
be a stationary Poisson point process in R2 with intensity ?; see Fig. 3.
191
Braxmeier H et al: Spatial extrapolation of road traffic data
Fig. 3. Realization of<&.
For simulation of Poisson processes, see, e.g., Lantuejoul, 2002, Stoyan et al, 1995. A Boolean random set S with deterministic grains can be introduced as
S=U(S0 + Xi) i=1
where S0 is the so-called primary grain and S0 + Xi a grain translated to the germ position Xi.
Fig. 4. Realization of S.
The primary grain S0 can be an arbitrary compact set in R2. In the present paper, a rectangle
E0 = [0,a]x[0,b]                         (17)
with width a > 0 and height b > 0 is considered; see Fig. 4. On the basis of S, one constructs a stationary random field Y = {Y(u),u G R2} by setting
Y(u) = l{ueZ}-p
where the constant p = 1 - e-Xab is the volume fraction of S. This random field is a special case of a Boolean random function considered, e.g., in Serra, 1988. It can take only values - p or 1 - p. The field Y is stationary
of order two and it holds EY(u) = 0. So it can be used to model the “deviations from the mean”.
Fig. 5. Variogram y (level curves).
The variogram y(h) of Y is given by
y(h) = e-Xab (1 -e-Mab -s0n(s0-h)lA .
For a vector h = {h1,h.2), the area |S0 n (S0 - h)| is equal to (a - |h1|)(b - | h 2|) for |h 1| ^ a, | h 2| ^ b, and zero, otherwise. This variogram is clearly anisotropic as shown in Fig. 5 for parameter values a = 40, b = 20 and X = 0.0006 in the observation window W = [0,200]2.
In order to model a non-stationary field X, one adds a deterministic drift variable m{u) to the field Y{u). As a toy example, m{u) is chosen here to be the indicator function
m(u) = l{u e Br(u0)}                 (18)
of a deterministic circle Br(u0) with center u 0 and radius r > 0; see Fig. 6. The resulting field
X(u)=m(u) + Y(u),    ueR2
attains only three values -p, 1 - p, 2 - p.
Fig. 6. Realization ofX.
To test the extrapolation quality on synthetic data, one simulates X and measures its realization x{u) at
192
Image Anal Stereol 2004;23:185-198
a finite number of points ui. Then one extrapolates X from the data x(ui) and compares the result with the original realization x(u). Measurement points ui are generated by an independent Poisson process ?1 with intensity ?1 = 0.01; see Fig. 7.
Fig. 7. Realization of ?1.
The intensity of ?1 is substantially higher than that of ? since otherwise the information contained in the data is insufficient to reconstruct the original image.
SYNTHETIC DATA
Practically, the experiment described above should be repeated many times in order to reduce the randomness in the quality of results. In this paper, 90 realizations of X have been sampled. They yield 90 data sets each of them containing ca. 300 pairs (ui,x(ui)). These data sets correspond to the traffic data of a half an hour. The intensity of ?1 is chosen to produce in average about 300 measurement points to comply with the real traffic situation.
For simulations, we used the following parameter values:
W = [0,200]2, u0 = (100,100), r = 30, a = 40, b = 20, ?= 0.0006.
The mean area fraction is then
p = 0.38121662.
In Fig. 4, a realization of ? with these parameter values is shown. By adding a circle in the middle of the picture and subtracting p, one obtains a realization of the random field X; see Fig. 6.
NUMERICAL RESULTS; RECONSTRUCTION OF SIMULATED IMAGES
To estimate the drift, moving average with the side length ?= 3 of the square neighborhood was used.
Fig. 8. Estimated drift m(u).
As seen in Fig. 8, the estimated drift preserves the original drift structure.
After subtracting the estimated drift m(u) from the data in each data set j, j = 1,...,90, the empirical variogram y* of Y* is computed; see Fig. 9. The parameter values of the circular segment (9) are 8 = 2,
L = 3°.
Fig. 9. Estimated variogram f(h) (level curves).
Then, one averages the variogram over all 90 estimated copies 7* by arithmetic mean:
This mean variogram can be well-fitted by the true variogram of the Boolean model. For fitting, simulated annealing was used to minimize the target function (15) in the least squares method.
193
Braxmeier H et al: Spatial extrapolation of road traffic data
The parameters of the simulated annealing are chosen as follows: maximal temperature 106, annealing rate 20, number of iterations 10, tolerance value 10-5; see Press et al, 2002, for their meaning. The starting values of the variogram parameters were a 0 = 20, b0 = 10, A0 = 0.006. The fitting yields parameter values
a = 39.7605124, b = 20.7768498, A =0.001193
lying quite close to the original ones. The maximal (mean) deviation of f from f is 0.03684976 (6.337388 • 10-5, respectively); see Fig. 11.
Fig. 10. Fitted variogram model ??(h) (level curves).
Fig. 11. Difference between the fitted theoretical model f(h) and the empirical variogram f{h).
The knowledge of the grain shape (17) can be integrated in the indicator functions of the kriging with moving neighborhood. Put the set {u«• G A(u)} in (4) to be equal to
{|x-xi|^a,|y-yi|^b}
where u, = x ,y,) and u = (x,y) denote the Euclidean coordinates of points u, and u.
Fig. 12. Residual Y*(u).
Thus, the extrapolation method will use only those points u that can potentially affect the value Y*{u). The extrapolation results Y*(u) and X(u) are shown in Figs. 12 and 13. The striking similarity of the images for X(u) and X(u) in Figs. 6 and 13, respectively, is a clear evidence for the high quality of the extrapolation method.
STATISTICAL TESTS FOR THE AREA FRACTION
The threshold image of Y* in Fig. 14 is a binary image that can be compared with the original image of Y in Fig. 4. Written in terms of functions, it is equal to \{u G Š} where Š = {u : Y*(u) ^ 1/2-p} and 1/2-p = 0.11878339181. To quantify visual similarities in both images, statistical tests for the area fraction can be used, see Bo¨hm et al, 2004. For each of 90 threshold images, the null hypothesis H0:p = pis tested vs. its alternative H1:p^p where
^_ |šnW| p~   | W|
Fig. 13. Extrapolated field X(u).
194
Image Anal Stereol 2004;23:185-198
is an estimator of the area fraction of the threshold image and
p
0.38121660819385905
the area fraction of the original Boolean model. If the threshold image is a realization of a Boolean model and the null hypothesis H0 is true the corresponding test statistic
N(0,1)
—
V
W\(p-p)
E iWniW-h C 1 h)
\h\^b
is  asymptotically N(0,1)-distributed  as   |W| -»¦ °° where
C.,    |Šn(Š-h)nWn(W-h)|    ^
(_,1 ( h J   =   ----------------------------------------------  — p 2
\Wn(W-h)\ is a consistent estimator of the covariance function
C 1 (h) =P{oeZ,heZ)-P2{o e Š)
of the random sets.
Fig. 14. The threshold image S of Y*
Thus, the null hypothesis H0 is rejected at the asymptotic significance level 1 - 0 if
\T\>z 1-B/2,
where z 1-e/2 is the (1 - 0/2)-quantile of the standard normal distribution.
For 0 = 0.04 and z1-e/2 = 2.054, the null hypothesis H0 was rejected in 6% to 10% of realizations depending on the series of the 90 images. It attests statistically the visual similarity of the images of S and Š. The test results can be improved by
choosing larger observation windows (e.g., 400 x 400 pixels), smaller grains (e.g., a = 20, b = 10) and more measurement points per image (say, 2000). The reason for that is the asymptotic nature of the test. The significance level is approximately equal to 1 - 0 if W is large enough, i.e., beginning from a particular relation between the sizes of grains and the observation window. Additionally, we suppose that increasing the number of measurement points would improve the extrapolation quality and consequently reduce the rejection rate of H0 to 4%. However, in our experiments we kept the number of approx. 300 measurement points constant in order to preserve analogies to the traffic problem setting.
ANALYSIS OF TRAFFIC DATA
In what follows, the above extrapolation method is applied to real traffic data.
The original data set contains entries with spatial positions scattered not only over Berlin but also over a wide region with radius of approx. 100 km from the city center. To avoid too large inhomogeneities, the observation window is reduced to downtown Berlin with geographic coordinates
13.3 ^x^ 13.46666,
52.48333 ^^ 52.55 .
Then, the data analysis is performed for the directional sector S2 = {a:n/2^a<n} including data of taxis moving northwest. This partial data set contains 19699 entries collected over 90 days (see Fig. 15).
To calculate the mean velocity field m(u), the moving average in (16) is applied to the data of sector 2 (see Fig. 16). The side length of the square moving neighborhood is t = 0.005. In Figs. 15 and 16, the northwest movement direction of the taxis can be clearly recognized. Color gradations reflect speed variation from green and blue for high values through yellow for the middle ones up to red for the low ones. Figure 16 shows the corresponding mean field m(u).
The comparison of both maps confirms that the estimator m preserves the spatial velocity structure of the data. To estimate the variogram f of Y*, the mean values m(u) have to be subtracted from the actual velocity values. Then, the empirical variogram y* is calculated for each day i = 1,..., 90.
T
_
195
Braxmeier H et al: Spatial extrapolation of road traffic data
			'¦'-^.:		\ ¦' ¦				
									
			e-" ^  ;'-		v				' ¦
			¦':.		\.			i	¦'
			- \		¦ '¦:% :¦¦•				¦ .
	,--*....¦   *"*	'J"H>^.	.......--..,		ID -^>				'.;.
			t   _.....,		,V.,^..V    ,    ^				
			¦" -\. - "¦		&."	'"'V...... .			:
			' •: ¦'^.;-';.:. ...:.		^¦¦-......,„		"'¦" >;'->¦,..		
Fig. 15. Positions of taxis moving northwest.
Fig. 16. Mean field m(u) of data set 2.
Averaging on all days, one obtains the following variogram estimator for Y?
f(h)
1   90 90
1=1
see Fig. 17.
The parameters of the segment in (9) used for variogram calculation are 8 = 0.006 and e = 3° with maximal distance h = 0.07 being approximately a half diameter of W. The empirical variogram f(h) with maxima in northwest direction and minima in orthogonal direction is zonally anisotropic showing substantial northwest correlation in the data.
In Fig. 17, level curves are colored in accordance with the increasing variogram values from green, blue and yellow to red, where the zonally anisotropic behavior of f(h) near the origin becomes clear. The variogram values are low in a narrow sector at the polar angle of approximately 170°, i.e., traffic velocities are highly correlated in this direction.
Fig. 17. Empirical variogram f{h) (level curves).
Fig. 18. Fitted variogram model ??(h) (level curves).
The zonally anisotropic variogram model (10) with two fixed parameters ? = 170?, ?1/c22 = 1000 taken from Fig. 17 has been fitted to the empirical one; see Fig. 18. The classical least squares fitting method applied to one-dimensional vertical slices of the empirical variogram in orthogonal directions ? = 80? and ?= 170? yields other parameter values:
_
196
Image Anal Stereol 2004;23:185-198
a1	=   31.77189640437076,
bi	=    116.21092322,
c\	=   245388.67081,
b2	=    22.6344102,
f2/c 22	=   683964.79366.
Thus, the range values in directions 170? and 80? are r1 = 0.27 km and r2 = 0.162 km, respectively. So a vehicle driving in direction ? = 170? influences only those vehicles driving behind it in the same direction at a maximal distance 3r1 = 810 m. Vehicles driving behind it in the orthogonal direction 80? are influenced up to a distance of 3r2 = 648 m.
Additionally, the two-dimensional surface of the above variogram model has been fitted by least squares to the empirical one using genetic minimizing algorithms. Resulting parameter values are very close to those obtained above:
a1	=	19.379745108968454,
bi	=	95.3944270699768,
c\	=	245867.97491680854,
b2	=	9.486514644862856,
h/c 22	=	684317.2022809463,
Xi/c22	=	1023.8357320907359,
a	=	146,84°.
For extrapolation, the sample of velocities x(ui),...,x(un) (n = 223) observed on Monday, 18.02.2002 is used. Compared to the whole data set 2 representing the “past”, it is interpreted as “actual” data. The random field Y* of deviations from mean velocities is extrapolated using kriging with moving neighborhood (4) with the following indicator function
l{ui?A(u)} = l{(p(ui-u)?S2}
where (p(ui - u) is the polar angle of the vector ui - u. This assumption is rather intuitive since only those measurements with positions ui lying “ahead” of the current position u can influence its velocity value.
Extrapolated residuals Y*(u) and the resulting velocity map X(u) are shown in Figs. 19 and 20, respectively. Due to the particular asymmetric form of the indicators, the extrapolated field of residuals is strongly discontinuous. This obviously affects the geometric characteristics of X(u). Discontinuities of the realizations of X caused by the kriging with moving neighborhood are essential for precise localization of traffic-jam areas. In Fig. 21, areas with velocities X(u) < 15 kph are marked yellow. Some of these regions might be caused by traffic jams.
Fig. 19. Residual fieldY*(u).
Fig. 20. Velocity field X(u).
Fig. 21. Traffic jams: X(u) ^ 15 kph.
ACKNOWLEDGEMENT
This research has been supported by the German Aerospace Center (DLR) through research grant 931/69175067. The authors are grateful to Prof. Reinhard Ku¨hne and his co-workers from the DLR Institute of Transport Research for suggesting the problem and fruitful discussions on the subject.
197
Braxmeier H et al: Spatial extrapolation of road traffic data
REFERENCES
Bo¨hm S, Heinrich L, Schmidt V (2004). Asymptotic properties of estimators for the volume fraction of jointly stationary random sets. Statistica Neerlandica, 58:1-19.
Chile`s JP, Delfiner P (1999). Geostatistics: modelling spatial uncertainty. New York: Wiley.
Cressie NAC (1993). Statistics for spatial data. New York: Wiley.
Faulkner B (2002). Java classes for nonprocedural variogram modelling. Comput Geosci 28(3):387-97.
Genton MG (1998a). Highly robust variogram estimation. Math Geol 30:213-21.
Genton MG (1998b). Variogram fitting by generalized least squares using an explicit formula for the covariance structure. Math Geol 30:323-45.
Genton MG (2001). Robustness problems in the analysis of spatial data. In: Moore M, ed. Spatial Statistics: Methodological Aspects and Applications. Lecture Notes in Statistics. New York: Springer, 21-37.
GeoStoch (2004). Java library. University of Ulm, Department of Applied Information Processing and Department of Stochastics, http://www.geostoch.de/.
Goldberg DE (1989). Genetic algorithms in search, optimization, and machine learning. Boston: Addison-Wesley.
Heinrich L, Schmidt H, Schmidt V (2004). Limit theorems for stationary tessellations with random inner cell structures. Adv Appl Probab (to appear).
Kitanidis PK (1997). Introduction to geostatistics: applications to hydrogeology. New York: Cambridge University Press.
Klenk S, Schmidt V, Spodarev E (2004). A new algorithmic approach to the computation of Minkowski functionals of polyconvex sets. Preprint (submitted).
Lantue´joul C (2002). Geostatistical simulation: models and algorithms. Berlin: Springer.
Lehmann E, Casella G (1998). Theory of point estimation, 2nd edition. New York: Springer.
Mayer J, Schmidt V, Schweiggert F (2004). A unified simulation framework for spatial stochastic models. Simul Model Pract Th 12:307-26.
Me´gnin C (2001). Jsimul: a Java-based simulated annealing package. http://www.theblueplanet.org/ JSimul/html/JSimul readme.html.
Niemeyer P, Peck J (1996). Exploring Java. Bonn: O’Reilly.
Press WH, Teukolsky SA, Vetterling WT, Flannery BP (2002). Numerical recipes in C++. The art of scientific computing, 2nd edition. Cambridge, Mass.: Cambridge University Press.
Schmidt V, Spodarev E (2004). Joint estimators for the specific intrinsic volumes of stationary random sets. Stoch Proc Appl (to appear).
Segewick R (1992). Algorithmen in C. Bonn: Addison-Wesley.
Serra J (1988). Image analysis and mathematical morphology: theoretical advances, volume 2. London: Academic Press.
Stoyan D, Stoyan H, Jansen U (1997). Umweltstatistik. Stuttgart: Teubner.
Stoyan D, Kendall WS, Mecke J (1995). Stochastic geometry and its applications, 2nd edition. Chichester: Wiley.
Tomasi C, Manduchi R (1998). Bilateral filtering for gray and color images. In: Proceedings of the 1998 IEEE International Conference on Computer Vision, Bombay, India, 839-46.
Wackernagel H (1998). Multivariate geostatistics, 2nd edition. Berlin: Springer.
zu Castell W, Weller U, Zipprich M, Sommer M, Wehrhan M (2002). Kriging considered from the deterministic point of view. In: Bayer U, Burger H, and Skala W, eds. Terra Nostra. Schriften der Alfred-Wegener-Stiftung, volume 3, Berlin, 249-54.
198