ERK'2020, Portorož, 248-251
Automatic ski-jump distance measurement with convolutional neural
networks and computer vision
Matjaž Kukar¹, David Nabergoj¹
¹University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, 1000 Ljubljana, Slovenia
E-mail: matjaz.kukar@fri.uni-lj.si, david.nabergoj@student.uni-lj.si
Abstract. In ski jumping, video-assisted distance measuring is
used only at the top-level competitions (world cup, continen-
tal cup). For smaller competitions it is prohibitively expensive
and for this purpose we have developed a cost-effective sys-
tem using commercially available equipment (a surveillance
camera and an ordinary laptop computer). We further support
distance-measuring umpires by introducing fully automatic dis-
tance measuring based on convolutional neural networks and
computer vision techniques. We test our system on smaller
ski jumping hills and show that while the system cannot completely replace the human operator, it can significantly speed up the distance measuring process. Preliminary experiments on large hills confirm these findings and indicate that only minor modifications will be required for using multiple cameras.
1 Introduction
In Slovenia, there has been a signiﬁcant increase in ski jumping
popularity in recent years, a consequence of excellent compet-
itive results. At the primary level clubs reported doubled num-
bers of youngsters. This increased the burden on ski jumping
coaches, organizers and professional staff in competitions. In
our previous work [1, 2] we focused primarily on supporting
distance umpires who have a demanding, exposed role, and
their mistakes can signiﬁcantly inﬂuence the competition out-
comes. Our aim was to develop a system for supporting video-
assisted distance measuring on smaller hills with accessible
hardware requirements (a single-camera video system and a
laptop).
In this paper we upgrade the system using convolutional
neural networks and computer vision techniques in order to
provide automatic distance measurements with reasonable accuracy. We evaluate the automatic measurements at regional ski jumping competitions on small hills (Cockta
Cup) with respect to ofﬁcial results and show some directions
for future development and use on larger hills.
2 Methods and materials
Ski jump distance is defined as the distance between the edge of the jumping ramp and the point where both of the ski jumper's skis have touched the ground with their full surface [3, article 432.1].
The middle point between both legs is used when the legs are
apart (e.g., Telemark landing style). FIS requires a jump dis-
tance accuracy of 0.5 m. There are several special cases [4] and
even on the smallest competition hills it is difﬁcult to manually
measure the exact jump distance, as landing speeds exceed 10
m/s, and the angles between the landing slope and landing tra-
jectories of ski jumpers are often minute [5]. Existing video
distance measuring systems (Swiss Timing, Ewoxx) are basi-
cally video recorders and provide no automatic aid for umpires.
Figure 1: The measurement grid. Each measurement line is
precisely calibrated to the particular hill and corresponds to a
distance (in meters).
2.1 Automatic distance measurement system
The automatic distance measurement system analyses a sequence of frames where the ski jumper is visible. The ski jumper's location is determined for each frame; frames without the ski jumper (i.e. before or after the jump) are ignored.
The landing frame is the ﬁrst frame in the sequence where the
jumper has already landed. A measurement grid is overlaid on
top of each frame. It contains lines annotated with distances
(in meters) used to estimate the distance for each pixel in the
frame. The main steps in the system are: determining the land-
ing frame, ﬁnding the jumper’s feet (a pixel) in this frame, and
determining the distance based on the feet point and the mea-
surement grid. The measurement grid is shown in ﬁgure 1. The
distance measurement system is part of the full system, which
helps the distance umpire.
2.2 Motion detection
The input to the distance measurement system is a sequence
of frames where the jumper is visible. However, the jumper
is only visible when jumping and near the landing area. Any-
thing that occurs between jumps is not relevant to the system.
We must ﬁrst extract the relevant subsequence from a typically
longer sequence of frames. We do so using an algorithm that
receives a sequence of RGB images as input and outputs a vector $v$ of equal length that tells us when a jumper is visible. The $i$-th element $v_i$ is equal to 1 if the jumper is visible in the $i$-th frame, and 0 otherwise. The algorithm works by using background subtraction [6] to determine the frames where motion occurs. A binary mask is generated for each frame; the white pixels in the mask correspond to the area with motion.
The largest contour in the mask typically corresponds to the
jumper. If it is not large enough (a parameter, dependent on
the camera distance), we conclude that the jumper is not in
the frame. We use a median ﬁlter to remove some noise from
this contour and ﬁnally extract the smaller image of the jumper
from the axis-aligned bounding box of the processed contour
[7]. Visualization of the process is shown in ﬁgure 2.
Figure 2: The jumper localization process. The input to the
algorithm is an RGB image. It is converted to grayscale and
blurred using a Gaussian ﬁlter. A motion mask is obtained
using background subtraction. The largest moving object in
the mask corresponds to the jumper. Finally, the jumper cutout
is returned.
After the jump we get a sequence of boolean values, corre-
sponding to the jumper being in the frame. The jump is there-
fore a subsequence of frames where the corresponding boolean
value is true, meaning that the jumper is visible. There might be some frames within the jump where the jumper is not detected. We allow a few such frames in the jump; otherwise the complete jump might not be extracted correctly. We utilize a threshold $t = 2$, which allows for at most $t$ consecutive frames without a detected ski jumper in the jump sequence.
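The gap-tolerant extraction of the jump subsequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the system: the function name `extract_jump` is hypothetical, and keeping the longest run when several candidate runs exist is an assumed tie-break that the text does not specify.

```python
def extract_jump(visible, t=2):
    """Return (first, last) frame indices of the jump, or None.

    visible -- sequence of 0/1 flags, one per frame (the vector v)
    t       -- maximum number of consecutive undetected frames
               tolerated inside a jump
    """
    runs = []
    start = last = None
    for i, flag in enumerate(visible):
        if not flag:
            continue
        if start is not None and i - last - 1 > t:
            # The gap since the last detection is too long:
            # close the current run and start a new one.
            runs.append((start, last))
            start = None
        if start is None:
            start = i
        last = i
    if start is not None:
        runs.append((start, last))
    if not runs:
        return None
    # Assumption: the jump is the longest run found.
    return max(runs, key=lambda r: r[1] - r[0])


flags = [0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
jump = extract_jump(flags, t=2)
```

With `t=2` the two undetected frames at indices 4 and 5 are bridged, while the three undetected frames at indices 9 to 11 end the jump.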
2.3 Landing detection
The next part of the measurement pipeline is detecting when the landing occurs. We use a convolutional neural network that receives a frame of the jump segment as input. It activates output 0 when the jumper in the frame is in the air and output 1 when the jumper is on the ground; 0 and 1 correspond to the classes Air and Ground, respectively. We assign the landing to the first frame in the sequence classified as Ground. The architecture of the neural network is shown in figure 3.
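To give a feeling for the size of this network, the sketch below traces the feature-map sizes and the number of trainable parameters through the layers listed in figure 3. Several details are assumptions that the paper does not state: convolutions without padding and with stride 1, pooling that rounds down, bias terms in every layer, and a single input channel. The function name `cnn_summary` is hypothetical.

```python
def cnn_summary(input_size=100, filters=(32, 16, 8), kernel=3, pool=2,
                dense_units=16, n_classes=2):
    """Trace shapes and parameter counts for the figure 3 layer stack.

    Assumes unpadded stride-1 convolutions, floor pooling and biases.
    """
    size, channels, params = input_size, 1, 0
    shapes = []
    for f in filters:
        # Conv2D: one kernel x kernel x channels filter plus a bias per output map.
        params += (kernel * kernel * channels + 1) * f
        size = size - kernel + 1      # unpadded convolution
        size = size // pool           # max pooling
        channels = f
        shapes.append((size, size, channels))
    flat = size * size * channels     # Flatten
    params += (flat + 1) * dense_units            # Dense (16)
    params += (dense_units + 1) * n_classes       # Output (2)
    return shapes, flat, params


shapes, flat, params = cnn_summary()
```

Under these assumptions the three convolution blocks reduce the 100x100 input to 49x49, 23x23 and 10x10 feature maps, so the network stays small enough to run on a laptop CPU. Dropout layers have no parameters and do not change the shapes.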
2.4 Feet point calculation
Once we know the landing frame, we can process it using com-
puter vision methods to ﬁnd the precise jump distance. The
procedure for this is two-fold: we ﬁrst ﬁnd the jumper’s feet
point and then use the measurement grid to calculate the actual
distance. We find the feet point utilizing the mask generated in the motion detection phase. We first apply a median filter to the mask to remove noise. The feet point is then calculated as the intersection of the line going through the jumper's body and the line through their skis (figure 4). We can do this without a significant loss of accuracy because of the constrained position of the camera. The lines are found by using the RANSAC [8] procedure to fit two linear models to the motion mask. The models are selected if their lines are sufficiently vertical (corresponding to the line going through the body) or horizontal (corresponding to the line going through the skis). The use of RANSAC is justified since the motion mask contains many outlier pixels which are irrelevant to the lines (e.g., pixels corresponding to the jumper's hands are irrelevant to the line going through the body).
Input (100x100) → Conv2D (32@3x3) → MaxPool2D (2x2) → Dropout (p = 0.1) → Conv2D (16@3x3) → MaxPool2D (2x2) → Dropout (p = 0.1) → Conv2D (8@3x3) → MaxPool2D (2x2) → Dropout (p = 0.1) → Flatten → Dense (16) → Output (2)
Figure 3: The CNN architecture. The input is a 100x100 grayscale image and the output is a vector with two elements whose values correspond to the probabilities of classifying the image as Air or Ground.
Figure 4: Feet point approximation. The blue lines correspond
to the skis and the body of the jumper. The red circle repre-
sents the approximated feet point, which is computed as the
intersection of the two lines.
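The feet point approximation can be illustrated with a small, dependency-free RANSAC sketch. It is not the authors' implementation: all function names, the inlier tolerance `tol`, the angular limit `max_deg` and the number of iterations are hypothetical choices, and each line is fitted independently on all mask pixels, which is a simplifying assumption.

```python
import math
import random


def line_through(p, q):
    """Line a*x + b*y + c = 0 through p and q, with (a, b) a unit normal."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0:
        return None
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def is_vertical(line, max_deg=30):
    # The line direction is (-b, a); |b| is the sine of its angle to the vertical.
    return abs(line[1]) <= math.sin(math.radians(max_deg))


def is_horizontal(line, max_deg=30):
    # |a| is the sine of the angle between the line and the horizontal.
    return abs(line[0]) <= math.sin(math.radians(max_deg))


def ransac_line(points, accept, rng, n_iter=500, tol=1.5):
    """Candidate line with the most inliers among those passing accept()."""
    best, best_inliers = None, -1
    for _ in range(n_iter):
        line = line_through(*rng.sample(points, 2))
        if line is None or not accept(line):
            continue
        a, b, c = line
        inliers = sum(abs(a * x + b * y + c) <= tol for x, y in points)
        if inliers > best_inliers:
            best, best_inliers = line, inliers
    return best


def intersect(l1, l2):
    """Intersection of two lines in a*x + b*y + c = 0 form, or None if parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return (b1 * c2 - b2 * c1) / det, (a2 * c1 - a1 * c2) / det


def feet_point(points, seed=0):
    """Intersect the near-vertical body line with the near-horizontal ski line."""
    rng = random.Random(seed)
    body = ransac_line(points, is_vertical, rng)
    skis = ransac_line(points, is_horizontal, rng)
    if body is None or skis is None:
        return None
    return intersect(body, skis)


# Synthetic mask pixels (hypothetical): body, skis and an outstretched arm.
body_px = [(50, y) for y in range(10, 60)]
skis_px = [(x, 60) for x in range(30, 81)]
arm_px = [(50 + k, 30 - k) for k in range(1, 10)]
feet = feet_point(body_px + skis_px + arm_px)
```

In this synthetic example the arm pixels lie on a diagonal, so no candidate through them passes both the orientation test and the inlier count, and the two accepted lines meet at the pixel where the body meets the skis.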
2.5 Automatic measurement using measurement lines
Once the feet point is calculated, we use the measurement grid
to find the measurement lines immediately above and below it. Each
measurement line has a distance associated with it (e.g. 20
meters). We use linear interpolation with respect to the clos-
est two measurement lines to calculate the precise distance at
the feet point. Formally, the distance is calculated based on the measurement line distances $p_1$ and $p_2$ as well as the Euclidean distances $d_1$ and $d_2$ from the feet point to these lines along the grid direction vector:
$$x = \frac{d_2}{d_1 + d_2}\, p_1 + \frac{d_1}{d_1 + d_2}\, p_2 \qquad (1)$$
The values $d_1$ and $d_2$ are the distances from the feet point $T$ to the points $P_1$ and $P_2$, respectively. These points and lines are visualized in figure 5.
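As a worked instance of equation (1), the following sketch interpolates the distance from pixel coordinates. The function name `jump_distance` and the sample coordinates are hypothetical; the points $P_1$ and $P_2$ are assumed to be already known.

```python
import math


def jump_distance(T, P1, P2, p1, p2):
    """Equation (1): interpolate between line distances p1 and p2 (meters).

    T, P1, P2 are pixel coordinates; P1 and P2 lie on the two
    neighboring measurement lines along the grid direction through T.
    """
    d1 = math.dist(T, P1)
    d2 = math.dist(T, P2)
    if d1 + d2 == 0:
        raise ValueError("P1 and P2 coincide with T")
    return (d2 * p1 + d1 * p2) / (d1 + d2)


# Feet point 30 px from the 10 m line and 10 px from the 12 m line.
x = jump_distance(T=(0, 30), P1=(0, 0), P2=(0, 40), p1=10.0, p2=12.0)
```

Here the feet point lies three quarters of the way from the 10 m line to the 12 m line, so the interpolated distance is 11.5 m.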
The grid direction vector is based on local direction vectors. A
local direction vector is computed based on two neighboring
measurement lines, which are represented with their left and
right points $L_1$, $R_1$ and $L_2$, $R_2$ as follows:
$$v_{1,2} = \frac{\overrightarrow{L_1 L_2} + \overrightarrow{R_1 R_2}}{2} \qquad (2)$$
The grid direction vector is computed as the component-wise
weighted sum of local direction vectors. The grid direction
vector should be more similar to local direction vectors, which
correspond to measurement lines that are close to the feet point
than to those further away. This reduces the degree to which
imprecisely placed measurement lines affect the ﬁnal distance.
The values $d_i$ and $d_{i+1}$ denote the distances from the measurement lines $i$ and $i+1$ to the feet point. The weights are calculated as follows:
$$w_{i,i+1} = e^{-\frac{d_i + d_{i+1}}{2}} \qquad (3)$$
$$w'_{i,i+1} = \frac{w_{i,i+1}}{\sum_{i=1}^{n-1} w_{i,i+1}} \qquad (4)$$
The weights are greater for local direction vectors that are closer to the feet point. We use them to compute the grid direction vector:
$$v = \sum_{i=1}^{n-1} w'_{i,i+1}\, v_{i,i+1} \qquad (5)$$
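Equations (2) to (5) can be combined into a short sketch. The function names and the sample grid are hypothetical. One implementation detail is an addition of this sketch rather than part of the paper: the exponents are shifted by their maximum before exponentiation, which leaves the normalized weights of equation (4) unchanged but avoids numerical underflow when distances are given in pixels.

```python
import math


def local_direction(l1, r1, l2, r2):
    """Equation (2): mean of the vectors L1->L2 and R1->R2."""
    return ((l2[0] - l1[0] + r2[0] - r1[0]) / 2,
            (l2[1] - l1[1] + r2[1] - r1[1]) / 2)


def normalized_weights(dists):
    """Equations (3) and (4) for consecutive pairs of measurement lines."""
    exponents = [-(dists[i] + dists[i + 1]) / 2 for i in range(len(dists) - 1)]
    # Shifting by the largest exponent cancels out in the normalization.
    shift = max(exponents)
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [w / total for w in raw]


def grid_direction(lines, dists):
    """Equation (5).

    lines[i] = (L_i, R_i) are the end points of measurement line i,
    dists[i] is the distance of line i to the feet point.
    """
    vx = vy = 0.0
    for i, w in enumerate(normalized_weights(dists)):
        dx, dy = local_direction(*lines[i], *lines[i + 1])
        vx += w * dx
        vy += w * dy
    return vx, vy


# Three measurement lines; the third one is shifted sideways by 2 px.
lines = [((0, 0), (10, 0)), ((0, 10), (10, 10)), ((2, 20), (12, 20))]
weights = normalized_weights([1, 1, 11])
v = grid_direction(lines, [1, 1, 11])
```

Because the feet point is close to the first two lines, almost all weight goes to their local direction vector, and the sideways shift of the distant third line barely affects the result.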
Figure 5: Visualization of the points and lines used in the precise distance computation. The lines $p_1$ and $p_2$ correspond to measurement lines with their associated distances (e.g. 10 and 12 meters). The point $T$ corresponds to the feet of the jumper. The measurement grid direction vector and the point $T$ are used to describe the line $p_3$. The intersections of lines $p_1$ and $p_3$, as well as $p_2$ and $p_3$, are the points $P_1$ and $P_2$, which we use in the linear interpolation to obtain a more precise distance measurement.
2.6 Data
For 330 ski jumps recorded in junior competitions within the PKP [1] and ŠIPK [2] projects, we acquired the official results (measured by umpires using the eyes-only manual method). The ski jumps were recorded on smaller hills (HS 20-30 m) and on average consisted of 36 frames recorded in HD resolution (1280 × 720 pixels) at 50 FPS. In order not to obstruct the umpires' view, the camera was placed lower than the umpires. Each official measurement was further augmented by a professional ski jumping coach using manual video measurement. For testing purposes the data was split into folds in a stratified manner, so that all frames belonging to a ski jump were assigned to the same fold (either for training or for testing).
3 Results
3.1 Landing frame detection results
The CNN by itself performed landing prediction relatively well.
It was compared with a naive method which always predicted
the landing to be in the middle of the sequence. The prediction
results on an independent testing set are shown in a confusion
matrix in table 1. Performing a 10-fold cross validation on the
CNN results in an average accuracy of 0.922.

| Actual class | Predicted Air | Predicted Ground |
|---|---|---|
| Air | 995 | 56 |
| Ground | 122 | 1308 |

Table 1: Confusion matrix for the CNN predictions.

There are two neurons in the last layer of the network, each outputting a value between 0 and 1. With normalization we get the probabilities $P_{\text{Air}}$ and $P_{\text{Ground}}$, which correspond to the two possible classes. The landing is determined at the first frame where $P_{\text{Air}} < 0.5$. We can observe that the model often outputs probabilities of about 0.5 for frames around the true landing frame (figure 6).
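The landing rule can be sketched as follows. The function names are hypothetical, and dividing each output by the sum of both outputs is only one plausible reading of "normalization"; the paper does not say which normalization is used. The sample activations are made up for illustration.

```python
def normalize(a_air, a_ground):
    """Turn the two output activations into (P_Air, P_Ground).

    Assumption: normalization divides by the sum of both outputs.
    """
    total = a_air + a_ground
    if total == 0:
        return 0.5, 0.5
    return a_air / total, a_ground / total


def landing_frame(outputs, threshold=0.5):
    """Index of the first frame with P_Air below the threshold, else None."""
    for i, (a_air, a_ground) in enumerate(outputs):
        p_air, _ = normalize(a_air, a_ground)
        if p_air < threshold:
            return i
    return None


# Hypothetical per-frame activations (Air, Ground) of the last layer.
outputs = [(0.9, 0.1), (0.8, 0.3), (0.5, 0.5), (0.2, 0.7), (0.1, 0.9)]
frame = landing_frame(outputs)
```

The third frame has $P_{\text{Air}}$ exactly equal to 0.5 and is therefore still treated as Air; the landing is assigned to the fourth frame (index 3). Frames with probabilities close to 0.5 are exactly the ones where a one-frame shift of the predicted landing is most likely.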
Figure 6: Sequence of predicted class probabilities (Air, Ground) on a test video, plotted as probability against video frame, with the true and the predicted landing marked. The model is initially very certain that the jumper is in the air. The value $P_{\text{Ground}}$ increases as $P_{\text{Air}}$ decreases. The landing is predicted as soon as $P_{\text{Air}}$ first drops below 0.5.
3.2 Determining the landing distance
We considered several scenarios in order to evaluate impor-
tant aspects of the system (table 2). First we evaluated the
landing distance computation procedure independently of the
landing frame predictions. This meant using the actual landing frames in the evaluation, which yielded a mean absolute error of $\mathrm{MAE}_{\mathrm{dist}} = 0.404$ m. The error is larger when the jumper lands at the top of the landing area, since the measurement lines are placed more densely together there, a consequence of the camera being placed very close to the hill and lower than the umpires.
| Scenario | Jump distance MAE (m) |
|---|---|
| On true landing frame (testing set) | 0.404 |
| With landing frame detection (testing set) | 0.586 |
| With landing frame detection, 10-fold CV | 0.946 |
| With landing frame detection and bias correction, 10-fold CV | 0.785 |

Table 2: Mean absolute error for jump distance predictions in meters. The first row corresponds to predictions based on the actual (true) landing frames. The second row corresponds to full pipeline predictions, based on predicted landing frames.
The full measurement pipeline consists of the landing frame detection (CNN) and the landing distance determination. Due to misclassification of landing frames, the error $\mathrm{MAE}_{\mathrm{pipeline}} = 0.586$ m is higher. Both numbers were obtained on an independent testing set without problematic jumps.
We also performed a 10-fold cross validation using all 330 jumps and achieved a total error of $\mathrm{MAE}_{\mathrm{CV}} = 0.946$ m. The predicted distances (figure 8) are mostly too short, indicating a systematic error (bias). On a typical laptop (without utilizing the GPU), the entire prediction procedure takes about 0.84 seconds.
Figure 7: Measurement system in action at the HS=109 m hill in Kranj (bottom) and HS=21 m hill in Mengeš (top), Slovenia. The laptop and the camera are connected using a PoE switch.
The systematic error (bias) arises because the camera is placed too low on the side of the landing hill. As in World Cup competitions, we can specifically focus on jumps that are long enough (at least 17 meters in our case). By correcting these predictions for the median error of 0.51 meters, the $\mathrm{MAE}_{\mathrm{CV}}$ is reduced to 0.785 meters. By only analyzing the jumps where the CNN correctly predicts the landing, we achieve an MAE of 0.25 meters.
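The effect of a median bias correction on the mean absolute error can be sketched as follows. The distances below are made-up toy values, not data from the evaluation, and the function names are hypothetical. The sketch also estimates the bias on the same jumps it corrects and does not model the restriction to jumps of at least 17 meters; both are simplifications.

```python
from statistics import median


def mae(pred, true):
    """Mean absolute error between predicted and true distances."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


def median_bias(pred, true):
    """Median signed error; positive when predictions are too short."""
    return median(t - p for p, t in zip(pred, true))


def correct(pred, bias):
    """Shift every prediction by the estimated bias."""
    return [p + bias for p in pred]


# Toy values in meters (hypothetical), with predictions mostly too short.
true = [18.0, 19.5, 21.0, 22.5]
pred = [17.5, 19.0, 20.5, 21.5]

bias = median_bias(pred, true)
mae_before = mae(pred, true)
mae_after = mae(correct(pred, bias), true)
```

In this toy example the median error is 0.5 m and the correction lowers the MAE from 0.625 m to 0.125 m. A constant shift removes only the systematic part of the error; the remaining error comes from jumps whose deviation differs from the median.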
4 Conclusion
Our evaluation shows great potential for automatic ski jumping distance measuring. Even in the current limited configuration we can measure distances with 1 m precision (at most 2 frames off, which requires very little human intervention) in reasonable conditions on the relevant part of the hill, both on small and large hills. While this is coarser than the 0.5 m required by FIS, it is still a very useful addition. There is considerable interest from ski jumping clubs and the Ski Association of Slovenia (SAS). For use on larger hills, slight modifications of the software are planned in order to allow for several cameras. The landing detection subsystem needs further testing under non-optimal weather (rain, snow) and artificial lighting conditions. We plan to achieve these aims in a future partnership with SAS, as we have applied for co-funding from the Slovenian Foundation for Sports.
Figure 8: Distribution of the predicted and the true jump distances (horizontal axis: predicted distance; vertical axis: actual distance). The diagonal (blue line) indicates perfect predictions. Points within the green area (± 1 m) refer to jumps where the prediction error is at most 1 meter. Points above the diagonal denote predictions that are too short.
References
[1] T. Ciglarič et al. “Video meritve dolžin smučarskih skokov”. In: Zbornik šestindvajsete mednarodne Elektrotehniške in računalniške konference ERK 2017 (2017), pp. 337–340.
[2] M. Kukar. “Evaluation and Prospects of Semi-Automatic Video Distance Measurement in Ski Jumping”. In: Proceedings of the 21st International Multiconference - IS 2018 (2018), pp. 62–65.
[3] FIS: The International Ski Competition Rules (ICR). https://fis-ski.com. Accessed: 9. 8. 2019.
[4] FIS: Guidelines to Video Distance Measurement of Ski Jumping 2011. https://fis-ski.com. Accessed: 9. 8. 2019.
[5] N. Sato, T. Takayama, and Y. Murata. “Early Evaluation of Automatic Flying Distance Measurement on Ski Jumper’s Motion Monitoring System”. In: 2013 IEEE 27th International Conference on Advanced Information Networking and Applications. IEEE. 2013, pp. 838–845.
[6] Z. Živković. “Improved adaptive Gaussian mixture model for background subtraction.” In: ICPR. 2004, pp. 28–31.
[7] G. Bradski. “The OpenCV Library”. In: Dr. Dobb’s Journal of Software Tools (2000).
[8] M. A. Fischler and R. C. Bolles. “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography”. In: Commun. ACM 24.6 (1981), pp. 381–395. ISSN: 0001-0782. DOI: 10.1145/358669.358692.