https://doi.org/10.31449/inf.v47i2.4933 Informatica 47 (2023) 295–296 295
Detecting Temporal and Spatial Anomalies in Users’ Activities for Security
Provisioning in Computer Networks
Aleks Huč
University of Ljubljana, Faculty of Computer and Information Science, Ljubljana, Slovenia
aleks.huc@fri.uni-lj.si
Thesis Summary
Keywords: anomaly detection, incremental learning, unsupervised learning, clustering, adaptive windowing, profiling,
network security, network flows
Received: June 1, 2023
The paper summarizes a Doctoral Thesis that focuses on two new approaches for detecting anomalies
in computer networks based on network flows. The approaches use incremental hierarchical clustering
algorithms and monitor changes in the data structures to detect anomalies. Both approaches achieved
prediction performance comparable to the state-of-the-art supervised approaches (F1 score over 0.90),
even though our approaches see every data point only once and then discard it, and they operate
without a prerequisite learning phase with labeled data.
Povzetek (abstract in Slovenian, translated): The paper summarizes the content of a doctoral
dissertation in which we focus on two new approaches for detecting anomalies in computer networks.
The approaches are based on network flows, incremental hierarchical clustering and monitoring of
changes in the data structures in order to detect anomalies. Both approaches achieve a comparable
level of detection (F1 measure over 0.90) to the latest supervised methods, even though our
approaches see every data point only once and then forget it, and they operate without a prior
learning phase with labeled data.
1 Introduction
The goal of computer network security is to provide a secure
environment for a computer network, its resources,
data in storage and transit, and all its users [1]. Network
security starts with intrusion detection, where an intrusion is
defined as a deliberate unauthorized attempt (successful or not)
by an intruder to gain access to, manipulate or misuse a computer
system or network [1]. Examples include Trojans, viruses,
malware and denial of service, brute force and probe attacks.
Over the years of active development, two main categories
of intrusion detection approaches have emerged:
signature-based and anomaly-based [2]. Signature-based
approaches detect intrusions on the basis of signature
databases, while anomaly-based approaches detect intrusions
on the basis of deviations from models of normal activity.
Various anomaly detection approaches have already
been proposed, but because they rely on supervised and
batch learning they have problems with today’s dynamic
computer networks and their large volume and high velocity,
variety and variability. Newer approaches have switched to
unsupervised, incremental and adaptable learning to improve
upon and augment traditional approaches and provide
better overall anomaly detection.
This paper summarizes a Doctoral Thesis [3] that pro-
vides two new approaches for improving the current state-
of-the-art anomaly detection using unsupervised, incre-
mental, adaptable and hierarchical clustering.
2 PHICAD
PHICAD (Profile- and Hierarchical Incremental
Clustering-based Anomaly Detection) is a single-layer,
unsupervised and incremental algorithm that detects
network activity anomalies in real time. The input is a
stream of chronologically ordered flows. The algorithm
receives a new flow and sends it to two profiles, one for
the source and one for the destination IP address. A
profile models the incoming and outgoing activity of an
individual network entity. The algorithm then extracts,
transforms, and normalizes the features from the flow into
a real-valued vector. The vector is then clustered inside the
hierarchical clustering tree of the appropriate profile.
Anomalies are determined in the leaf nodes. If
the new vector is merged with an existing leaf, we track the
distance between the new vector and the leaf, the change of
the leaf centroid and the leaf size. If the new vector becomes
a new leaf, we track the distance between the new leaf and
the centroids of the neighboring leaves. The predictions from all
detection mechanisms are fed into a short-term model that
discards mechanisms that trigger too often and reports the final
predictions.
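
The leaf-level detection described above can be illustrated with a minimal sketch. It is not the thesis implementation: the hierarchical tree is replaced by a flat list of leaves, and the class name `ProfileSketch`, the Euclidean distance and all threshold parameters (`merge_radius`, `distance_limit`, `shift_limit`, `size_limit`, `new_leaf_limit`) are assumptions introduced only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Leaf:
    centroid: List[float]
    size: int = 1


def distance(a, b):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ProfileSketch:
    """Flat stand-in for one profile's clustering tree (leaves only).

    All thresholds are hypothetical parameters, not values from the thesis.
    """

    def __init__(self, merge_radius=1.0, distance_limit=0.8, shift_limit=0.3,
                 size_limit=100, new_leaf_limit=5.0):
        self.merge_radius = merge_radius
        self.distance_limit = distance_limit
        self.shift_limit = shift_limit
        self.size_limit = size_limit
        self.new_leaf_limit = new_leaf_limit
        self.leaves = []

    def update(self, vector):
        """Cluster one vector, then forget it; return the triggered mechanisms."""
        if not self.leaves:
            self.leaves.append(Leaf(list(vector)))
            return []  # nothing to compare against yet
        nearest = min(self.leaves, key=lambda leaf: distance(leaf.centroid, vector))
        d = distance(nearest.centroid, vector)
        triggered = []
        if d <= self.merge_radius:
            # Merge: update the centroid incrementally, without storing the vector.
            old = list(nearest.centroid)
            nearest.size += 1
            nearest.centroid = [c + (v - c) / nearest.size for c, v in zip(old, vector)]
            if d > self.distance_limit:
                triggered.append("merge_distance")
            if distance(old, nearest.centroid) > self.shift_limit:
                triggered.append("centroid_shift")
            if nearest.size > self.size_limit:
                triggered.append("leaf_size")
        else:
            # New leaf: compare it with the centroid of the nearest existing leaf.
            self.leaves.append(Leaf(list(vector)))
            if d > self.new_leaf_limit:
                triggered.append("new_leaf_distance")
        return triggered
```

Each call to `update` sees the vector once and keeps only the leaf summaries (centroid and size), which mirrors the incremental, single-pass constraint stated in the text. The returned mechanism names are raw predictions that a later filtering step would still have to confirm.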
3 PHI2CAD
PHI2CAD (Profile- and Hierarchical Incremental Two-layer
Clustering-based Anomaly Detection) builds upon
our single-layer PHICAD with an additional second-layer
unsupervised and incremental clustering algorithm which
detects anomalies in profiles and groups of profiles.
The input into our approach is again a stream of
chronologically ordered flows. First, the flow is sent to the first
layer, where the PHICAD algorithm creates and updates
profiles of network entities and detects network anomalies
in each individual profile separately. For each flow,
PHICAD produces two updated profiles with predictions
for possible anomalies, one for the source and one for the
destination IP address, which are then sent to the PHI2CAD
algorithm on the second layer.
For each updated profile, PHI2CAD first checks whether it has
already been clustered into its tree data structure and, if it
has, whether the updated profile is still inside its leaf.
If it is still inside, we check for possible anomalies
caused by the updated profile and produce possible
anomaly predictions by tracking the distance between the
updated profile and the leaf, the change of the leaf centroid, and
the leaf size.
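
The membership check above, together with the fall-back that applies when the check fails (removing the previous version of the profile and clustering the updated one again), can be sketched as follows. This is only a sketch of the bookkeeping under stated assumptions: leaves are kept in a flat dictionary instead of a tree, "inside the leaf" is taken to mean "within `leaf_radius` of the leaf centroid", and the names `SecondLayerSketch`, `leaf_radius` and the returned status strings are hypothetical. Anomaly scoring is deliberately left out.

```python
import math


def distance(a, b):
    """Euclidean distance between two profile vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class SecondLayerSketch:
    """Flat stand-in for the second-layer tree: leaves hold profile vectors."""

    def __init__(self, leaf_radius=1.0):
        self.leaf_radius = leaf_radius  # hypothetical membership threshold
        self.leaves = {}                # leaf id -> {profile id: vector}
        self.leaf_of = {}               # profile id -> leaf id
        self._next_leaf = 0

    def centroid(self, leaf_id):
        members = list(self.leaves[leaf_id].values())
        return [sum(column) / len(members) for column in zip(*members)]

    def update(self, profile_id, vector):
        """Return 'stayed', 'moved' or 'inserted' for one updated profile."""
        status = "inserted"
        leaf_id = self.leaf_of.get(profile_id)
        if leaf_id is not None:
            if distance(vector, self.centroid(leaf_id)) <= self.leaf_radius:
                # Still inside its leaf: replace the stored version in place.
                self.leaves[leaf_id][profile_id] = list(vector)
                return "stayed"
            # Fell outside: remove the previous version before re-clustering.
            del self.leaves[leaf_id][profile_id]
            if not self.leaves[leaf_id]:
                del self.leaves[leaf_id]
            status = "moved"
        self._cluster(profile_id, vector)
        return status

    def _cluster(self, profile_id, vector):
        """Join the nearest leaf within the radius, or open a new leaf."""
        best = None
        for leaf_id in self.leaves:
            d = distance(vector, self.centroid(leaf_id))
            if d <= self.leaf_radius and (best is None or d < best[0]):
                best = (d, leaf_id)
        if best is None:
            leaf_id = self._next_leaf
            self._next_leaf += 1
            self.leaves[leaf_id] = {}
        else:
            leaf_id = best[1]
        self.leaves[leaf_id][profile_id] = list(vector)
        self.leaf_of[profile_id] = leaf_id
```

In this sketch a profile that drifts slowly stays in its leaf and only changes the leaf summary, while a profile that changes abruptly is removed and re-inserted, which is the point at which the leaf-level detection mechanisms would be evaluated.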
Otherwise, if an updated profile has not been clustered
yet or it falls outside the leaf to which it was previously
assigned, we cluster the updated profile into the PHI2CAD tree data
structure, while its previous version, if it exists, is removed
from the tree. Finally, possible anomaly predictions are
determined in the leaf to which the updated profile has been
clustered. If the updated profile is merged with an existing
leaf, we track the distance between the updated profile
and the leaf, the change of the leaf centroid and the leaf size. If
the updated profile becomes a new leaf, we track the
distance between the new leaf and the centroids of the
neighboring leaves. The predictions from all detection mechanisms
are fed into a short-term model that discards mechanisms
which trigger too often and reports the final predictions.
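
One possible reading of the short-term model is a sliding window over the most recent observations that mutes any mechanism firing more often than a set limit. The sketch below follows that reading only. The class name `ShortTermModel` and the parameters `window` and `max_triggers` are assumptions made for illustration, not details taken from the thesis.

```python
from collections import deque


class ShortTermModel:
    """Mute detection mechanisms that fired too often in the recent past."""

    def __init__(self, mechanisms, window=100, max_triggers=20):
        self.max_triggers = max_triggers  # hypothetical limit per window
        # One bounded history of True/False observations per mechanism.
        self.history = {name: deque(maxlen=window) for name in mechanisms}

    def report(self, triggered):
        """Record one round of raw predictions and return the trusted ones."""
        trusted = []
        for name, history in self.history.items():
            fired = name in triggered
            history.append(fired)
            # Booleans sum as 0/1, so this counts recent firings.
            if fired and sum(history) <= self.max_triggers:
                trusted.append(name)
        return trusted
```

A final prediction is then reported only when `report` returns a non-empty list. Because the history is bounded, a muted mechanism becomes trusted again once its old firings slide out of the window, which keeps the filter adaptive over time.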
4 Conclusion
The goal of this dissertation was to investigate whether we can
devise an anomaly detection approach with the following
operational constraints: incremental execution, unsupervised
learning, real-time response, ability to analyze large data
sets, lightweight design and ability to adapt to changes over
time, while still providing performance comparable to classic
approaches and/or providing additional new insights.
We have evaluated our two approaches using the state-of-the-art
data set CICIDS2017 [4], which comprises the most
common network anomalies. To measure the predictive
performance we used standard machine learning metrics
such as precision, recall and F1 score, and we also compared
the execution time against the supervised approaches. To further
explain the achieved prediction performance we analyzed the
influence of individual features on the predictions and
performed a sensitivity analysis of the main parameters.
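
For reference, the three metrics named above follow the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is the harmonic mean of the two. A minimal sketch, with the function name and the binary label encoding (1 = anomalous flow, 0 = normal flow) chosen here purely for illustration:

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against empty denominators when nothing is predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With made-up labels such as `y_true = [1, 1, 1, 0, 0, 1]` and `y_pred = [1, 1, 0, 0, 1, 1]`, there are three true positives, one false positive and one false negative, so all three metrics equal 0.75.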
Our approaches can successfully detect Denial of Service,
Distributed Denial of Service, Port Scan and Web attacks
when analyzing each anomaly separately, and are also
able to detect anomalies when they analyze entire data
sets with multiple types of anomalies. Performance is good
where anomalous patterns clearly differ from the normal
activity (F1 score over 0.90). However, the approaches have
problems detecting attacks that are represented by flows
similar to normal flows, that are executed on higher layers
of the network stack, or that are part of packet payloads.
We have to be mindful, however, of the diminishing importance
of packet-payload analysis due to the increasing use of
packet-payload encryption. The results were also published
in a peer-reviewed journal paper [5].
References
[1] Kizza, J. M. (2020) Guide to computer network security,
Springer.
[2] Thakkar, A. and Lohiya, R. (2021) A survey on intrusion
detection system: feature selection, model,
performance measures, application perspective, challenges,
and future research directions, Artificial Intelligence
Review, Springer, pp. 1–111.
[3] Huč, A. (2022) Detecting temporal and spatial
anomalies in users’ activities for security
provisioning in computer networks, doctoral
dissertation, Ljubljana,
https://repozitorij.uni-lj.si/IzpisGradiva.php?id=137562.
[4] Sharafaldin, I. and Lashkari, A. H. and Ghorbani, A.
A. (2018) Toward Generating a New Intrusion Detection
Dataset and Intrusion Traffic Characterization,
4th International Conference on Information Systems
Security and Privacy (ICISSP), pp. 108–116.
[5] Huč, A. and Trček, D. (2021) Anomaly detection in
IoT networks: From architectures to machine learning
transparency, IEEE Access, IEEE, pp. 60607–60616.