https://doi.org/10.31449/inf.v48i15.4646 Informatica 48 (2024) 191–206 191
Intrusion Detection System for 5G Device-to-Device Communication Technology in Internet of Things
Ola Malkawi¹, Wesam Almobaideen², Nadeem Obaid³, Bassam Hammo³
¹ Amman Arab University, Jordan
² University of Jordan, Rochester Institute of Technology, Jordan
³ University of Jordan, Jordan
E-mail: o.malkawi@aau.edu.jo, wxacad@rit.edu, obein@ju.edu.jo, b.hammo@ju.edu.jo
Keywords: device to device communication, intrusion detection system, machine learning, classification, 5G cellular
communications
Received: Feb 1, 2023
The emergence of the Internet of Things (IoT) has raised the need for high-quality communications and high-performance networks. 5G cellular communication technology is ready to provide such high-quality communication channels by using various advanced technologies. Device-to-device (D2D) communication is one of the technologies that have been suggested for 5G. With this technology, mobile devices can communicate with each other without the involvement of a base station (BS). This can eliminate congestion, expand the coverage area and increase throughput. Communicating devices set up a multi-hop path using nearby devices, which act as relaying elements, or routers. However, the self-organizing nature and the lack of centralized control of D2D make it easier to launch multiple types of attacks. In this paper, an intrusion detection system (IDS) is proposed using machine learning techniques. Eight types of attacks are considered to train the system for intrusion detection, and multiple classification algorithms are then compared. Finally, a multi-objective model is designed based on the results of the comparison to secure the communication process under D2D technology. The dataset used is generated using Network Simulator NS-2.
Povzetek (summary, translated from Slovenian): The article presents an intrusion detection system (IDS) for device-to-device (D2D) communication in 5G technology, which uses machine learning to recognize several types of attacks.
1 Introduction
The massive growth in wireless communications poses many challenges in meeting users' requirements. These requirements include the transmission of large data volumes, reliable communications and short response times. The need to meet these requirements is increasing dramatically, especially with the existence of the Internet of Things (IoT) [27]. The large number of communicating mobile devices results in fully overloaded, low-performance or even dysfunctional cellular networks [21]. The next generation of cellular networks, i.e. 5G, is a promising solution for the growing demand for high-performance networks [9], as it utilizes a number of technologies including multiple-input multiple-output (MIMO), mm-Waves, small cells, beamforming, full duplex and device-to-device (D2D) communications [15], [4]. These technologies have come to fulfill the 5G promises.
Device-to-device communications can provide an efficient use of millimeter waves and better utilization of the available bandwidth. With D2D, the communication between two devices can be accomplished without the involvement of a BS, which may require long-distance communications. Any two devices can communicate over multiple small hops instead of two long hops, from the sender to the BS and from the BS to the receiver. Therefore, a User Equipment (UE) can either help other UEs to communicate without the need to contact a BS, as Figure 1 shows for the communication between devices (B) and (C), or assist another UE to communicate with the BS, as depicted in Figure 1 between device (A) and the BS, even when that UE is located outside the transmission range of the BS [34].
There is a lack of research investigating security in D2D cellular networks. That is, to the best of our knowledge, no research work has considered the security attacks that result from the self-organizing nature of D2D devices, where no centralized point is responsible for controlling the communication process. Nevertheless, a number of studies have considered other security problems, such as [14], where a new key management approach is proposed to secure the communication process between devices in D2D technology. However, because there are many similarities between D2D technology and the wireless ad hoc paradigm, security studies on ad hoc networks can be applied to D2D communications.
Figure 1: Communication in D2D technology

Intrusion detection in wireless network environments becomes a very challenging task, especially with the emergence of modern technologies in which normal users can initiate the cellular communication process using their ordinary user equipment. That is, any user can advertise any piece of information to other users within the communication process, regardless of the degree of authenticity or honesty of that user. This can give rise to a large number of illegal actions that corrupt the functionality of such systems. Traditional systems, which depend on pre-established rules to classify users' actions as normal or malicious, may be unable to perform efficiently because new attacks arise constantly. A more suitable choice is the use of data mining and machine learning techniques [11], where data can be collected and used to train a system to discriminate the normal behavior of a network from behavior with malicious actions.
In this paper, we suggest that a moderate database is established in each base station, where the traffic is collected and analyzed based on a specially designed model to classify the network behavior as either normal or malicious. The appropriate procedure can then be taken to secure the network. We have used NS-2 to simulate the D2D environment in order to create the dataset, which contains normal network behavior as well as the behavior of eight attacks. Five classification algorithms have been compared to select the best classifier, including random forest, artificial neural networks, support vector machines, decision trees and Naïve Bayes. The suggested features are ordered based on the importance of each feature and have been tested to select the optimal subset of features for the final model. After the optimal classification algorithm, random forest, has been selected, it is applied to design a general model to classify new types of attacks which have not been seen in the test dataset. Based on the classification results, the proposed model is presented and discussed, and it is shown to provide a highly secure system.
The contribution of this work is summarized as follows:
1. A new dataset is generated using Network Simulator 2. The dataset consists of 4200 instances; each instance represents either normal network traffic or attacked network traffic for a number of nodes within two minutes.
2. Multiple classification algorithms are tested to select the most appropriate classifier for the proposed IDS.
3. A complete intrusion detection model is proposed based on the selected classifiers.
4. The proposed model is tested and shown to be efficient in detecting both seen and unseen attacks.
This paper is organized as follows. Background on D2D technology, possible attacks, the data mining field and classification algorithms is provided in Section 2. In Section 3 we discuss our methodology and experiment environment. Results are presented and discussed in Section 4. Section 5 presents the proposed intrusion detection model. Finally, conclusions are drawn in Section 6.
2 Background
In this section, a brief background is provided on D2D communication technology, its relation to ad hoc networks, the security of D2D devices and the types of possible attacks on the D2D communication process.
2.1 D2D communications
D2D was first proposed in 3GPP Release 12 [35] under the title ProSe, which stands for Proximity-based Services, and it was limited to adjacent devices within only one hop. Afterward, the concept of D2D evolved with 4G (LTE) for emergency services [19]. D2D can offer many advantages in cellular systems. These advantages include:
1. The ability to access communication services in emergency or disaster situations [21], where nodes can relay information without connecting to the dysfunctional cellular network.
2. Better utilization of the spectrum, by using millimeter waves and unlicensed spectrum [35], [31], [21], [34].
3. Network coverage expansion, where the communication range can be expanded without adding additional BSs [35], [21].
4. Optimizing power consumption [15], [35], [31], [21], [34].
5. Economic benefits [15], by reducing the cost per bit and increasing revenue for operators [30], [13].
6. Flexibility for traffic offloading [15].
7. Better exploitation of device proximity [21].
8. Eliminating interference [35], [25] due to the high path loss, which is defined as the attenuation of electromagnetic waves during propagation through space. Path loss is considered an advantage in D2D communications because concurrent communications can be carried out without interfering [4].
9. Eliminating congestion [35], because the traffic is distributed rather than accumulated around the BS.
10. Diminishing data loss and the need for re-transmission, which saves bandwidth [35].
These advantages imply higher network performance in terms of throughput [35], [31], latency [15], system capacity [12] and quality of service (QoS) [30], which can provide a significant advancement to a wide variety of uses. To encourage cellular network users to participate in D2D technology, relaying D2D devices may be compensated either with a financial incentive or by provisioning services such as security during communication [30].
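Item 8 in the list above treats high path loss as an advantage for concurrent D2D links. As a rough illustration only, the sketch below computes free-space path loss; the free-space model, the 100 m distance and the 2 GHz and 28 GHz carrier frequencies are assumptions chosen for the example and are not taken from the paper.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    if distance_m <= 0 or frequency_hz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT
    )


# Hypothetical comparison: the same 100 m link at 2 GHz and at 28 GHz.
loss_2ghz = free_space_path_loss_db(100.0, 2e9)
loss_28ghz = free_space_path_loss_db(100.0, 28e9)

# Extra attenuation of the higher-frequency link, in dB.
extra_loss_db = loss_28ghz - loss_2ghz
```

Under this model the higher-frequency link is attenuated more at the same distance, which is the effect the text relies on when it says that nearby D2D transmissions interfere less with one another.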
2.2 D2D communications and ad hoc networks
By studying the distinctions between D2D and ad hoc networks, we can conclude that D2D can easily operate in ad hoc mode. The most significant difference between D2D and ad hoc networks is that D2D can ask for assistance from the BS in some situations, such as control, synchronization, path discovery [16] and resource allocation [18], while in ad hoc networks there is no such centralized assistance. Thus, D2D communication can be either controlled by a BS, or uncontrolled, where each device performs peer discovery.
In the literature, the suitability of applying ad hoc routing to D2D is studied in [21] by implementing both AODV and DSDV in D2D communications. The results in [21] show that using ad hoc routing protocols with D2D is a promising approach for cellular communications. Additionally, AODV is shown to be a suitable candidate for D2D. AODV has also shown better energy consumption for large-scale D2D networks [26], and it has been suggested for D2D communications in [1] and [16]. In this paper, we have adopted the AODV routing protocol to simulate the D2D environment and create our own dataset for training and testing the proposed model.
2.3 Security in D2D communications
There is a lack of existing research on the security of D2D communication technology. To the best of our knowledge, no research has considered possible attacks when applying D2D technology in cellular communications. Nevertheless, we cannot overlook security studies on attacks in ad hoc networks, which are closely linked to possible attacks on D2D. In this section we review a number of existing studies related to IDSs in ad hoc networks. In [3], the state of the art of IDSs in ad hoc networks is summarized. Multiple types of IDS have been designed, such as fuzzy-logic-based systems and cross-layer acknowledgement-based systems. One major IDS type is the classification-based IDS, which depends on machine learning techniques. We consider this type as it is the most closely related to our approach.
In [3], multiple classifiers are mentioned, such as SVM, NN and NB, which have been shown to be the most efficient classifiers. In [7], different IDSs for ad hoc networks are investigated and classified into multiple types. The machine-learning-based IDS is discussed; the most common models for this type are Bayesian networks, fuzzy logic, NN and GA. In this research, we apply these classifiers and compare them to select the most appropriate one for our model.
In [32], an IDS has been proposed for wireless mesh networks (WMN), which are a type of ad hoc network. A dataset has been generated using NS3. Five attacks are targeted by the proposed IDS. Genetic algorithms have been used for feature selection; the main idea in [32] is that a different set of features might be beneficial for each attack. However, the proposed IDS is limited to the specified attacks, and the work does not discuss whether it could deal with further or unseen attacks. In [20], the notion of cooperative IDSs is discussed. An optimization problem is presented for how long an IDS needs to remain active in a mobile ad hoc network to achieve higher protection while saving battery life. In [29], an IDS is proposed based on clustering, where a cluster head is responsible for monitoring to detect attacks, rather than continuous monitoring by individual nodes, which can lead to high depletion of a node's battery life. In [23], a number of attacks have been considered based on the notion of an adaptive response mechanism, given that fixed response mechanisms have many deficiencies related to detection ability and power consumption.
The use of machine learning in intrusion detection systems is also adopted in [5], where multiple classifiers are compared to construct an IDS for a small smart home network with eight connected devices. The proposed IDS has shown promising results for a small network; it needs to be extended to a wider network to prove the feasibility of using machine learning for such networks.
Despite the fact that the communication style of ad hoc networks is similar to D2D in many aspects, all ad hoc IDSs depend on the fact that there is no centralized point to monitor traffic, except for cluster heads in cluster-based ad hoc networks. Even in this type, cluster heads are often ordinary nodes with limited capabilities. Most studies depend on host-based detection, which relies on the capabilities and traffic of the wireless node. In D2D, however, the main difference is the presence of the base station, which can act as a centralized point and be utilized to monitor the overall traffic, analyze it and identify attacks if they occur. Table 1 summarizes the state of the art for security in the literature and highlights its main limitations. In this paper, we have considered these limitations. We start from the security aspect by considering the most likely attacks and network variations as our first priority; then we consider the performance of machine learning algorithms and power consumption. We have also investigated the ability of the proposed IDS to detect new attacks. Moreover, we have considered moderate to large network sizes.
3 Methodology
In this section, we discuss the methodology of this research and show the steps that we have gone through to develop our IDS. Figure 2 shows the block diagram of our methodology. As Figure 2 shows, our methodology consists of five stages: problem understanding, data generation, data preparation, modeling, and evaluation. In the following subsections we describe these stages.
3.1 Problem understanding
The Internet of Things is gaining more and more acceptance and popularity among different categories of users. The open nature of IoT opens the door to a wide variety of attacks and makes them very easy to launch [28]. In cellular communication, and particularly with D2D, it is essential to detect these attacks with a robust and efficient IDS in order to keep the communication process functioning efficiently. To achieve this objective, this work proposes and designs a data mining model to detect attacks as they occur in the network; based on the detection outcomes, the system follows a predefined plan to counter the malicious node.
3.2 Dataset generation
As D2D is a technology still under development, to the best of our knowledge no real dataset of actual traffic is available. Therefore, we have generated this dataset using Network Simulator 2. In our simulation experiments, we have selected the AODV routing protocol [2], as it has been shown to be the most efficient routing protocol for D2D technology, as stated in [21].
We have conducted 4200 simulation scenario experiments, each lasting two minutes. Either 50 or 100 nodes are deployed within a 1000 m × 1000 m terrain area. Mobility speed has been varied from 0 m/s, which denotes static, immovable nodes, to 12 m/s, which denotes a node moving at about 43 km/h, equivalent to the speed of driving a car in a residential quarter. Table 2 illustrates the details of the simulation environment.

Figure 2: Main Steps of Proposed Methodology
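The scenario parameters described above can be enumerated with a short sketch. It assumes only the node counts, speed ranges and duration stated in the text; the dictionary keys are hypothetical names, and how the 4200 runs are distributed over these combinations and over the attack types is not specified here, so it is not modeled.

```python
from itertools import product

NODE_COUNTS = (50, 100)                                # UEs per scenario
SPEED_RANGES_MPS = ((0, 3), (3, 6), (6, 9), (9, 12))   # mobility ranges, m/s
DURATION_S = 120                                       # two minutes per run


def mps_to_kmh(speed_mps: float) -> float:
    """Convert metres per second to kilometres per hour."""
    return speed_mps * 3600.0 / 1000.0


# Every combination of node count and mobility range (hypothetical key names).
configurations = [
    {"nodes": nodes, "speed_range_mps": speed_range, "duration_s": DURATION_S}
    for nodes, speed_range in product(NODE_COUNTS, SPEED_RANGES_MPS)
]

# The fastest nodes move at 12 m/s, i.e. roughly 43 km/h.
max_speed_kmh = mps_to_kmh(12)
```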
Each instance of the generated dataset represents two minutes of traffic for either 50 or 100 nodes moving at (0-3) m/s, (3-6) m/s, (6-9) m/s or (9-12) m/s. Part of this dataset represents normal network behaviour; the remaining instances represent traffic with the aforementioned attacks launched. After the simulation has been conducted for the 4200 instances, performance has been measured in terms of predefined metrics, which are considered as the dataset features. In the next section we briefly discuss the proposed features that are used as inputs to the classification algorithm.
3.3 Feature extraction
After the simulations were conducted, we proposed a number of traffic properties to be used as input features for the classification algorithms. Table 3 shows our proposed features with a brief description of each. These features can be measured from the trace files produced by the simulation. Trace files act as detailed log files for all communications in a given scenario, and they are analyzed to calculate the aforementioned features. Figure 3 depicts a part of the final dataset. In Figure 3 we can see the features of Table 3 as well as two class labels. We have considered two class labels since we will produce two models, as we discuss later.
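As a minimal sketch of how some features can be derived once packet counts have been extracted from a trace file, the code below follows the definitions the paper gives for lost packets, delivery ratio and throughput. The `ScenarioCounts` container and the function name are hypothetical, and parsing of the NS-2 trace file itself is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ScenarioCounts:
    """Packet counts for one simulated scenario (hypothetical container)."""
    sent: int
    received: int
    duration_s: float


def derived_features(counts: ScenarioCounts) -> dict:
    """Compute LOST, PDR and THRPUT from the raw SENT and RCVD counts."""
    lost = counts.sent - counts.received
    # PDR is undefined when nothing was sent; NaN marks the value as missing.
    pdr = counts.received / counts.sent if counts.sent > 0 else float("nan")
    throughput = counts.received / counts.duration_s  # delivered packets/s
    return {
        "SENT": counts.sent,
        "RCVD": counts.received,
        "LOST": lost,
        "PDR": pdr,
        "THRPUT": throughput,
    }


features = derived_features(ScenarioCounts(sent=1000, received=900, duration_s=120.0))
```

Marking an undefined ratio as NaN is a design choice of this sketch; it lets a later cleaning step drop such instances together with other null or infinite values.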
3.4 Data preparation
Table 1: Security in literature

| Paper | Target network type | Goals | Limitations |
|---|---|---|---|
| (Alnaghes et al.) | Ad hoc | Comparison between existing classifiers | Does not consider security aspects (ML enhancement only) |
| (Vijayanand et al.) | Wireless mesh networks | Feature selection | Limited to 5 attacks; does not consider unseen attacks |
| (Marchang et al.) | Ad hoc | Studying how long an IDS needs to be active | Limited to battery life aspects and IDS activation time |
| (Subba et al.) | Clustered WSNs | Battery life; IDS based on cluster heads | Cluster heads are normal nodes, which means that monitoring can lead to battery depletion |
| (Nadeem et al.) | Ad hoc | Adaptive IDS to save battery | Battery life is the main consideration rather than security itself |
| (Anthi et al.) | Ad hoc | Comparison of classifier performance within a smart home environment | Limited to small networks (8 devices only) |

Table 2: Simulation Environment

| Simulation Parameter | Value |
|---|---|
| Simulator | NS2 |
| Routing Protocol | AODV |
| Transport Layer Protocol | TCP |
| Simulation Duration | 120 seconds |
| Number of UEs | 50, 100 nodes |
| Mobility Speed | (0-3, 3-6, 6-9, 9-12) m/s |
| Terrain area | 1500 × 1500 m² |

As the dataset has been generated with a simulation tool, we were able to control the data format by building the output features as needed. The only pre-processing needed was dealing with missing values by deleting instances that contain null or infinite values. After this step, the final version of the dataset is ready to be used as input to the classification algorithms. The class distribution of the final dataset is shown in Figure 4. The next step is to perform feature selection; the main goal of applying feature selection here is to use as few features as possible to reduce computational cost. We have adopted a simple feature selection approach using the R language, a programming language used for statistical computation [8]. We have used R to order the proposed 14 features according to their importance in the random forest classifier, using the R importance function. Figure ?? shows the feature ranking obtained with the R importance function in the random forest classifier, which will be discussed in detail later.
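The cleaning step described in this subsection, deleting instances that contain null or infinite values, can be sketched as follows. The row layout, the `label` column name and the sample rows are hypothetical and serve only to exercise the filter; they are not taken from the generated dataset.

```python
import math


def is_clean(instance: dict) -> bool:
    """True when every feature value is present and finite."""
    for name, value in instance.items():
        if name == "label":  # hypothetical name of the class column
            continue
        if value is None:
            return False
        if isinstance(value, (int, float)) and not math.isfinite(value):
            return False
    return True


def drop_incomplete(instances: list) -> list:
    """Keep only instances without null, NaN or infinite feature values."""
    return [row for row in instances if is_clean(row)]


# Illustrative rows only (not real simulation output).
raw = [
    {"SENT": 1000, "RCVD": 900, "PDR": 0.9, "label": "normal"},
    {"SENT": 0, "RCVD": 0, "PDR": float("nan"), "label": "attack"},
    {"SENT": 500, "RCVD": 10, "E2E": float("inf"), "label": "attack"},
    {"SENT": 800, "RCVD": None, "label": "normal"},
]
cleaned = drop_incomplete(raw)
```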
3.5 Modeling
Once the dataset is ready for the classification process, we have to select the most appropriate classification algorithm. We decided to compare multiple classifiers in order to select the best one based on the classification outcomes. The compared classifiers have been chosen based on previously designed IDSs. As noted before, multiple classifiers have been implemented in the literature, such as support vector machines, K-nearest neighbors, artificial neural networks, decision trees and Naive Bayes, and they have been proven to be efficient in developing IDSs [3]. Therefore, we have selected these classifiers for comparison, and we add random forest as it is considered a promising classifier, especially in intrusion detection systems [24], [22]. The goal of this step is to find the best prediction model to obtain a high-performance IDS. WEKA (Waikato Environment for Knowledge Analysis) version 3.8.3 has been used to apply the aforementioned classification algorithms and compare the results objectively.
In this research, our objective is to build two IDS models. The first is a binary classification model, which classifies network traffic as either normal or abnormal, where abnormal denotes the occurrence of an attack. The second model aims to specify the attack name and type. In this paper, we conduct our experiments on each of these two models separately. Finally, we integrate the two models into one multi-objective IDS.
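The paper's integrated model is presented in its own section. The sketch below shows only one possible way to chain a binary detector with an attack-type classifier, and it is an assumption rather than the authors' design. The two rule functions are placeholders standing in for trained models, and their thresholds and attack names are invented for illustration.

```python
from typing import Callable, Dict

Features = Dict[str, float]


class TwoStageIDS:
    """Chains a binary detector with an attack-type classifier."""

    def __init__(
        self,
        detector: Callable[[Features], bool],
        attack_classifier: Callable[[Features], str],
    ) -> None:
        self.detector = detector
        self.attack_classifier = attack_classifier

    def classify(self, features: Features) -> str:
        # Stage 1: normal versus abnormal traffic.
        if not self.detector(features):
            return "normal"
        # Stage 2: name the attack, only for traffic flagged as abnormal.
        return self.attack_classifier(features)


# Placeholder rules standing in for trained models (NOT from the paper).
def toy_detector(features: Features) -> bool:
    return features["PDR"] < 0.5


def toy_attack_classifier(features: Features) -> str:
    return "attack_type_A" if features["DUP"] > 100 else "attack_type_B"


ids = TwoStageIDS(toy_detector, toy_attack_classifier)
verdict_normal = ids.classify({"PDR": 0.95, "DUP": 3})
verdict_attack = ids.classify({"PDR": 0.10, "DUP": 250})
```

Passing the two models in as callables keeps the chaining logic independent of whichever classifier is finally selected for each stage.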
WEKA is a comprehensive collection of machine learning algorithms and data pre-processing tools. It is one of the most common data mining tools [10]. The main advantages of using WEKA are that it contains a wide variety of algorithms, it provides the most necessary performance measurements, and it has a simple graphical user interface, which makes it easy to use in data mining research.
3.6 Evaluation
Table 3: Description of dataset features.

| Abbreviation | Term | Description | Range |
|---|---|---|---|
| E2E | End to end delay | Time elapsed between sending and receiving a packet | 0 to infinity |
| DUP | Duplicated packets | Number of packets that have been sent more than once | 0 to infinity |
| OH | Overhead | Number of control packets sent | 0 to infinity |
| SENT | Sent packets | Number of packets initiated by all sources | 0 to infinity |
| RCVD | Received packets | Number of packets received by their intended destinations | 0 to infinity |
| LOST | Lost packets | Number of packets sent from their source that have not been delivered to the final destination | 0 to infinity |
| FWD | Forwards | Number of forwards for all transmitted packets by all intermediate nodes | 0 to infinity |
| THRPUT | Throughput | Number of delivered packets per second | 0 to infinity |
| RET | Re-transmissions | Number of packets that have been re-transmitted due to an error | 0 to infinity |
| PDR | Delivery Ratio | Received packets / sent packets | 0 to 1 |
| PATH | Path Length | Average number of hops from source to destination for all transmitted packets | 0 to the total number of nodes |
| TIME | Time | Time at which the last packet was sent or received | 0 to 120 |
| SPEED | Mobility Speed | Maximum mobility speed of the mobile nodes | 0 to 12 |
| DENS | Density | Number of nodes per 1000 m × 1000 m | 50, 100 |

To evaluate the performance of the classifiers selected for comparison, we have considered the most common evaluation parameters for classification algorithms. These performance metrics include classification accuracy, sensitivity, specificity, G-means and AUC. The first four measurements are based on the confusion matrix.
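The four confusion-matrix metrics named above can be computed as in the sketch below, using their standard definitions with the attack class taken as positive (an assumption). The counts in the example are invented for illustration and are not results from the paper.

```python
import math


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity, specificity and G-mean from confusion counts."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each class needs at least one instance")
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }


# Illustrative counts only: 100 attack instances and 50 normal instances.
metrics = confusion_metrics(tp=90, fp=5, tn=45, fn=10)
```

G-mean is useful alongside accuracy when the classes are imbalanced, because it drops sharply if either class is poorly recognized.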
4 Experiments and results
In this section, we discuss the experiments that we have
conducted based on the methodology described in Figure 2
and using the generated dataset. W e consider the following
experiment scenarios.
– Scenario 1: The standard classifiers k-NN, DT, NB, NN, RF and SVM are applied to the entire dataset without feature reduction; that is, all 14 features are used as inputs to all classification algorithms. Accordingly, multiple performance metrics are measured. Figure 5 depicts accuracy, recall, G-mean, F-measure and AUC for the aforementioned classifiers. From Figure 5, we can see that random forest does well on all performance metrics and achieves the highest level of performance. WEKA version 3.8.3 has been used for all comparison experiments. For Naive Bayes, random forest, support vector machines and J48 decision trees, we have used their Java implementations in WEKA. For k-NN, K is set to 1 as this value produced the best output. For artificial neural networks, the number of hidden layers is set to 2, as this is considered sufficient for simple datasets.
– Scenario 2: After the initial evaluation of the classification algorithms for use in our IDS, we concluded that random forest is the most appropriate classifier, so we used the R random forest importance function to rank the dataset features. The outcomes of the ranking process are presented in Figure ??. We have used this ranking to apply feature selection: we simply remove the x least important features and observe the performance of the target classifiers. x has been varied to find the optimal number of features to remove in order to get the best performance. We started by removing 4 features and keeping 10; then we removed 8 and 12 features, keeping 6 and 2 features, respectively. In scenario 2, the same classifiers applied in scenario 1 are applied using the same instances and parameter settings, while varying the removed features. The goal of the second scenario is to quantify the improvement in the performance of each classifier when the number of features is reduced. The overall aim is to use as few features as possible while obtaining the highest performance, in order to optimize the proposed IDS in terms of computational cost. Figure 6 shows the steps of scenario 2.

Figure 3: A shot of the generated data set
Figure 4: Class distribution of the generated data set
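The removal procedure of scenario 2 can be sketched as follows. The importance scores are placeholders: only the fact that duplicated and lost packets rank highest for binary classification comes from the paper, and the order of the remaining twelve features is invented for illustration.

```python
def keep_top_features(importance: dict, n_removed: int) -> list:
    """Drop the n_removed least important features; return the rest, best first."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    if not 0 <= n_removed < len(ranked):
        raise ValueError("n_removed must leave at least one feature")
    return ranked[: len(ranked) - n_removed]


# Placeholder scores (higher = more important). Only the top two positions
# reflect the paper; all other values are hypothetical.
importance = {
    "DUP": 14, "LOST": 13, "PDR": 12, "RCVD": 11, "THRPUT": 10,
    "SENT": 9, "FWD": 8, "OH": 7, "RET": 6, "E2E": 5,
    "PATH": 4, "TIME": 3, "SPEED": 2, "DENS": 1,
}

# x = 4, 8 and 12 removed features, as in scenario 2.
subsets = {x: keep_top_features(importance, x) for x in (4, 8, 12)}
```

Each subset would then be used to retrain and re-evaluate every classifier with the same instances and parameter settings.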
4.1 Experimental setup
In this section we discuss the experiments conducted to design the final model. The aforementioned classification algorithms, namely k-NN, DT, NB, NN, SVM and RF, are trained and tested using 10-fold cross validation. In 10-fold cross validation, the dataset is divided into 10 equal parts; training is carried out on nine parts and testing on the remaining part. Training and testing are repeated ten times, with a different test part each time. Finally, the average of all test results is reported.
When 10-fold cross validation is used, we can guarantee that the entire dataset is eventually used for both training and testing. Moreover, we ensure that stratified sampling is achieved by creating the 10 folds such that the class distribution in each fold is as close as possible to the distribution of the whole dataset. Here, stratified sampling is very important to get better results in terms of bias and variance [17].
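A minimal sketch of stratified 10-fold splitting is given below. It is a generic round-robin implementation and not WEKA's own procedure; the label list is hypothetical and is sized so that the folds divide evenly.

```python
import random
from collections import defaultdict


def stratified_folds(labels: list, k: int = 10, seed: int = 0) -> list:
    """Split instance indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so that it is spread over all folds.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


def train_test_splits(labels: list, k: int = 10, seed: int = 0):
    """Yield (train_indices, test_indices); each fold is the test set once."""
    folds = stratified_folds(labels, k, seed)
    for i, test_indices in enumerate(folds):
        train_indices = [
            index for j, fold in enumerate(folds) if j != i for index in fold
        ]
        yield train_indices, test_indices


# Hypothetical label vector: 80 attack instances and 20 normal instances.
labels = ["attack"] * 80 + ["normal"] * 20
folds = stratified_folds(labels)
splits = list(train_test_splits(labels))
```

The reported score of a classifier would be the average of its test results over the ten splits.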
4.2 Experiment I: binary classification with all features dataset
In this experiment, our selected classification algorithms are applied to the generated dataset without removing any feature. Our goal here is to evaluate how well all classifiers determine whether there is an attack or not, using all proposed features. Results are depicted in the column chart of Figure 5. By examining the results, we notice that very high classification accuracy can be achieved with most classifiers. We also notice that the lowest-performing classifier is NB. This is because the proposed features are not independent; most of them depend on each other. The delivery ratio, for example, is derived from two other features, namely sent and received packets. As another example, lost packets is the result of subtracting received packets from sent packets. In NB, classification is built on the assumption that features are independent of each other [6]. As we can see, our generated dataset violates this assumption, and therefore NB has the worst performance. In the next experiment we compare all classifiers in terms of a number of performance metrics; however, we apply feature selection based on the ranking shown in Figure ?? in order to optimize performance as well as computational cost.

Figure 5: Binary classification performance of classification algorithms for the entire dataset
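The dependence between features described above can be made concrete with a small sketch. The packet counts are synthetic values chosen for illustration and are not taken from the generated dataset. Lost packets and delivery ratio are both computed from the sent and received counts, and their Pearson correlation shows that they are far from independent.

```python
import math


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Synthetic counts for five scenarios (illustrative only).
sent = [1000, 1200, 800, 1500, 900]
received = [950, 1100, 500, 600, 90]

# Both derived features are functions of the same two counts.
lost = [s - r for s, r in zip(sent, received)]
pdr = [r / s for s, r in zip(sent, received)]

correlation = pearson(pdr, lost)
```

For these values the correlation is strongly negative, which is the kind of redundancy that conflicts with the independence assumption of Naive Bayes.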
4.3 Experiment II: binary classification with feature selection
This experiment addresses the task of determining whether there is an attack or not. To optimize the performance and computational cost of the final classification model for our IDS, we first applied the R importance function to order the suggested 14 attributes. The result of the ordering process for binary classification is shown in the left-hand table of Figure ??. Then we tried removing the least important features, since unnecessary features increase computational cost and time and may hamper the classification process, which limits performance.
To identify the optimal number of features to remove, we
tried different removal ratios to get the best model. We
evaluated the performance of the selected classifiers by
training them on datasets with different feature sets,
starting with the full-feature dataset, which produced the
results shown in Figure 5. Then we removed the lowest-ranked
4, 8 and 12 features, keeping the 10, 6 and 2 most important
features, respectively. We measured accuracy, root mean
square error, area under curve, recall, F-measure and G-mean.
Figure 7 through Figure 12 show the evaluation results of
this experiment.
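The removal schedule described above (drop the 0, 4, 8 or 12 lowest-ranked of 14 features) can be sketched as follows. Only the two top-ranked names, duplication and lost packets, come from the text; the remaining feature names are placeholders, and the functions `top_k_subsets` and `project` are hypothetical helpers, not part of the authors' tooling.

```python
def top_k_subsets(ranked_features, removals=(0, 4, 8, 12)):
    """Map each kept-feature count to the highest-ranked features.

    ranked_features must be ordered from most to least important.
    """
    total = len(ranked_features)
    subsets = {}
    for removed in removals:
        keep = total - removed
        if keep <= 0:
            raise ValueError(f"cannot remove {removed} of {total} features")
        subsets[keep] = ranked_features[:keep]
    return subsets

def project(rows, features):
    """Restrict every instance (a dict) to the selected feature names."""
    return [{name: row[name] for name in features} for row in rows]

# Placeholder ranking: positions 3..14 are NOT the real ranking from the paper.
RANKED = ["duplication", "lost"] + [f"feature_{i:02d}" for i in range(3, 15)]

subsets = top_k_subsets(RANKED)
for kept, names in sorted(subsets.items(), reverse=True):
    print(kept, "features kept, lowest-ranked kept feature:", names[-1])
```

Each subset would then be used to train and evaluate every classifier, so that the metrics can be compared across the four dataset variants.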
In Figure 7, we can see that feature reduction did not
significantly affect classification accuracy for any of the
classifiers except NB, which is the only classifier whose
accuracy improved, with an improvement of approximately 40%.
This was achieved by classifying with only the two most
important features, duplication and lost packets.
The same behaviour is observed in Figure 8, Figure 9,
Figure 10, Figure 11 and Figure 12 for recall, F-measure,
area under curve, G-mean and root mean square error,
respectively. These figures indicate that it is still easy
for the random forest, decision tree and K-nearest neighbor
classifiers to identify attacks, which represent the majority
class in the dataset. On the other hand, naive Bayes and
support vector machine have the worst classification
performance [33].
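For reference, the threshold-based metrics named above can be computed from binary confusion counts as in the sketch below. The definitions are the commonly used ones and are an assumption here: the text does not spell out its formulas, G-mean is taken as the geometric mean of recall and specificity, and tools that report class-weighted averages may give slightly different figures. The counts in the example are invented.

```python
from math import sqrt

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, recall, precision, F-measure and G-mean for the attack class."""
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if tp + fn else 0.0        # attacks detected
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0   # normal traffic kept
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "g_mean": sqrt(recall * specificity),
    }

# Invented counts with attacks as the majority class (100 attacks, 50 normal).
metrics = binary_metrics(tp=90, fp=10, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because G-mean multiplies recall by specificity, it penalizes a classifier that labels most traffic as an attack, which accuracy alone can hide when attacks form the majority class.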
We have noticed that the random forest classifier achieved
the best performance in F-measure (97%), accuracy (95%) and
AUC (98%), while it was the second best
Figure 6: steps of feature selection
classifier in terms of G-mean (89%) and root mean square
error (10%). We conclude that removing the least important
features did not significantly affect classification
performance, while it is guaranteed to limit computation
cost. The accuracy of random forest indicates that it can
correctly identify whether an attack is present in 95% of
cases.
In conclusion, random forest is the best model for intrusion
detection in this experiment. This conclusion is based on
accuracy, which is referred to as the detection rate and is
considered the most important metric in intrusion detection
systems. We adopt this conclusion in our final model.
4.4 Experiment III: attack based
classification with feature selection
In this experiment our target is to specify exactly the type
of attack launched, so we repeated the steps of Experiment II
in order to optimize the performance and communication cost
of the final classification model for our IDS. We started by
applying the R performance function to order the 14
attributes. The output of the ordering process for attack
classification is shown in the right-hand table of Figure ??.
Next, we removed the least important features. As in
Experiment II, we removed different numbers of features and
measured the performance, starting with the full-feature
dataset, whose results are presented in Figure 13. Then we
removed the lowest-ranked 4, 8 and 12 features, keeping the
10, 6 and 2 most important features, respectively.
Classification is applied to the remaining features to
identify the type of the attack. We measured accuracy, root
mean square error, area under curve, recall, F-measure and
G-mean. Figure 14 through Figure 19 show the evaluation
results of this experiment.
In Figure 14, we can see that feature reduction did not
considerably affect classification accuracy for the three
best classifiers (RF, KNN and J48). These three classifiers
showed the best accuracy, between 75% and 85%. The remaining
classification algorithms showed significantly lower accuracy
for attack specification (lower than 55%).
Similar performance is observed in Figure 15, Figure 16,
Figure 17 and Figure 18 for recall, F-measure, area
Figure 7: Accuracy of classification algorithms for the different datasets
Figure 8: Recall of classification algorithms for the different datasets
Figure 9: F-measure of classification algorithms for the different datasets
under curve and G-mean, respectively. The dominant
observation for these metrics is that RF, KNN and J48 have
the highest performance. Random forest is the best of them,
with 80%, 82%, 98% and 90% for recall, F-measure,
Figure 10: Area under curve of classification algorithms for the different datasets
Figure 11: G-mean for classification algorithms for the different datasets
Figure 12: Root mean square error of classification algorithms for the different datasets
area under curve and G-mean, respectively.
The next observation is that classification with all features
included has the best performance. Finally, the last
observation relates to NN, which shows an improvement in
performance with feature reduction until we remove 8 features
and keep 6. With fewer than 6 features, we notice a
significant drop in the performance metrics for NN in
particular.
Figure 13: Attack classification performance of classification algorithms for the entire dataset
Figure 14: Accuracy of classification algorithms for the different datasets
Figure 15: Recall of classification algorithms for the different datasets
Figure 19 shows the root mean square error for the evaluated
classification algorithms. Results are different for this
metric: there is no significant improvement as features are
eliminated. Moreover, the random forest classifier achieved
the lowest
Figure 16: F-measure of classification algorithms for the different datasets
Figure 17: Area under curve of classification algorithms for the different datasets
Figure 18: G-mean for classification algorithms for the different datasets
root mean square error, 16%, using all features.
In this experiment, we can conclude that random forest is
the most suitable classifier to be considered in our proposed
IDS. In Section 5, we integrate the results and conclusions
of these experiments to provide the final model of the
proposed IDS.
Figure 19: Root mean square error of classification algorithms for the different datasets
4.5 Experiment IV: detecting unseen attacks
As the number of users increases and new technologies emerge
with the Internet of Things, new attacks continuously occur.
In this section, we test the ability of the random forest
classifier to detect new attacks that were not included in
the training dataset. We selected random forest based on the
previous experiments, which showed that it is the most
appropriate classification algorithm. We divided our dataset
into two parts, a training dataset and a testing dataset. In
the testing dataset, we included instances of only two
attacks, A and B, as well as normal instances. The training
dataset, on the other hand, includes all attacks except A
and B.
The target is to measure the ability of the classifier to
recognize new attacks that it has not been trained on. We ran
this experiment 4 times with different values of A and B to
cover all attacks. Table 4 shows the performance of the
random forest classifier over the four experiments in terms
of accuracy, recall, F-measure and area under curve. From
Table 4, we can see that random forest is able to detect 86%
of unseen attacks, as represented by the average recall. The
detection ratio (accuracy) is also near 86%, which represents
the classifier's ability to distinguish normal network
behaviour from attacks. The average F-measure and area under
curve are approximately 84% and 80%, respectively, which we
consider acceptable for detecting unseen and emerging
attacks.
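The hold-out scheme of this experiment can be sketched as follows. The attack pairs are the ones listed in the results table; everything else is an assumption made for illustration: the row layout, the function names, and in particular the rule that sends every third normal instance to the testing set, since the text does not state how normal instances were divided.

```python
ATTACK_PAIRS = [
    ("rushing", "wormhole"),
    ("blackhole", "cache_poisoning"),
    ("hello_flooding", "jellyfish"),
    ("cooperative_blackhole", "greyhole"),
]

def unseen_attack_split(rows, held_out, normal_label="normal", normal_test_every=3):
    """Put the held-out attacks only in the test set, all other attacks only in training.

    Assumption: normal instances are shared, every `normal_test_every`-th one goes to test.
    """
    held_out = set(held_out)
    train, test = [], []
    normal_seen = 0
    for row in rows:
        label = row["label"]
        if label in held_out:
            test.append(row)
        elif label == normal_label:
            normal_seen += 1
            target = test if normal_seen % normal_test_every == 0 else train
            target.append(row)
        else:
            train.append(row)
    return train, test

def to_binary(row, normal_label="normal"):
    """Collapse attack types into the binary target used for detection."""
    return "normal" if row["label"] == normal_label else "attack"

# Toy dataset: 5 instances of each of the 9 labels (normal + 8 attacks).
all_labels = ["normal"] + [attack for pair in ATTACK_PAIRS for attack in pair]
rows = [{"id": i, "label": all_labels[i % len(all_labels)]} for i in range(45)]

for pair in ATTACK_PAIRS:
    train, test = unseen_attack_split(rows, pair)
    print(pair, "train:", len(train), "test:", len(test))
```

With binary targets, a classifier trained on `train` has seen the attack class but never the two held-out attack types, which is the situation the experiment evaluates.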
5 Intrusion detection model for D2D
communications
In this section we integrate the outcomes of the conducted
experiments to provide a design for a complete intrusion
detection system for D2D communications. Figure 20 shows the
IDS design for a cellular network. This model suggests adding
a spatio-temporal database system, of the kind used in
wireless communication networks, which holds data only for a
short time span within a geographic region. By adding the
spatio-temporal database to the cellular system, the traffic
of all nodes connected to a base station is temporarily
stored in this database. From this database, we can extract
the features of our proposed dataset. The classification
algorithm selected in this research, random forest, can then
be applied periodically, e.g. every two minutes. In the
initial step, random forest performs binary classification to
determine if there is an attack. If no attack is detected,
there is nothing to do; otherwise, a second classification is
applied to determine the attack type. Once the attack is
specified, the appropriate response is determined: either
disabling D2D communication and returning to the usual
cellular communication paradigm, or enabling one of the
detection or mitigation techniques. The response to a
detected attack represents a separate and complementary part
of our designed IDS.
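The two-stage decision flow described above can be sketched as a single detection cycle. This is a minimal illustration under stated assumptions: `toy_detect` and `toy_identify` are hypothetical threshold rules standing in for the two trained random forest models, their thresholds have no basis in the experiments, and the response table is invented. The periodic scheduler and the database query are omitted.

```python
from typing import Callable, Dict, Optional, Tuple

Features = Dict[str, float]
PERIOD_SECONDS = 120  # "every two minutes" in the text; scheduling itself is omitted

def detection_cycle(
    features: Features,
    detect: Callable[[Features], bool],
    identify: Callable[[Features], str],
    responses: Dict[str, str],
    fallback: str = "disable D2D and return to cellular mode",
) -> Optional[Tuple[str, str]]:
    """Stage 1: binary detection. Stage 2: attack type, then response lookup."""
    if not detect(features):
        return None                      # no attack detected, nothing to do
    attack = identify(features)
    return attack, responses.get(attack, fallback)

# Hypothetical stand-ins for the trained binary and multi-class models.
def toy_detect(f: Features) -> bool:
    return f["lost"] > 20 or f["duplication"] > 5

def toy_identify(f: Features) -> str:
    return "blackhole" if f["lost"] > 20 else "wormhole"

# Invented response table; attacks without an entry use the fallback.
RESPONSES = {"blackhole": "enable blackhole mitigation"}

for sample in ({"lost": 2, "duplication": 0},
               {"lost": 50, "duplication": 0},
               {"lost": 3, "duplication": 9}):
    print(sample, "->", detection_cycle(sample, toy_detect, toy_identify, RESPONSES))
```

Keeping the response lookup separate from the classifiers mirrors the text's point that the response to a detected attack is a separate and complementary part of the design.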
Figure 20: Intrusion Detection Model for a cellular network
with D2D communications
6 Conclusion
With the increasing number of internet users and the
emergence of new technologies, the number of cyber-security
attacks increases, and an intrusion detection system becomes
a necessity. Detection and mitigation techniques often come
at some cost, such as delay, additional equipment, overhead,
and so on. Machine learning techniques help to detect
intrusions based on existing knowledge and data, without
adding any extra cost.
In this research, the target was to use classification
algorithms for intrusion detection in D2D communications.
First, we generated our own dataset using the NS-2 simulator;
then we compared multiple classification algorithms to select
the most appropriate classifier for our IDS. We also applied
a simple feature selection based on feature importance
estimated using the R language.
Table 4: Performance results of unseen attacks
Exp No.  Unseen Attacks                Accuracy  Recall  F-Measure  AUC
1        Rushing + Wormhole            86.30%    0.863   0.866      0.898
2        Blackhole + Cache poisoning   84.10%    0.841   0.864      0.631
3        Hello flooding + Jellyfish    99.60%    0.997   0.991      1
4        Cooperative BH + Greyhole     75.20%    0.752   0.646      0.66
Average                                86.30%    86.33%  84.18%     79.73%
Figure 21: Contribution of this work as compared to SOTA
Figure 21 depicts the main contribution of this paper as
compared to previous research in the SOTA (State Of The Art).
Experiments indicated that random forest is the most
appropriate classification algorithm for our IDS. It achieved
a 97% detection rate for binary classification, and 85%
accuracy in attack type identification. Finally, we have
provided a suggested design for an IDS for a cellular
network.
References
[1] S. A. Abd, S. Manjunath, and S. Abdulhayan.
“Direct Device-to-Device communication in 5G
Networks”. In: Computation System and Information
Technology for Sustainable Solutions (CSITSS),
International Conference on. doi:
10.1109/CSITSS.2016.7779425. IEEE. 2016,
pp. 216–219.
[2] W. Almobaideen and D. AlKhateeb. “CSPDA:
Contention and stability aware partially disjoint
AOMDV routing protocol”. In: 2015 IEEE Jordan
Conference on Applied Electrical Engineering and
Computing Technologies (AEECT). doi:
10.1109/AEECT.2015.7360548. IEEE. 2015, pp. 1–6.
[3] M. S. Alnaghes and F. Gebali. “A Survey on Some
Currently Existing Intrusion Detection Systems for
Mobile Ad Hoc Networks”. In: The Second International
Conference on Electrical and Electronics Engineering,
Clean Energy and Green Computing (EEECEGC2015).
Vol. 12. 2015. URL:
https://api.semanticscholar.org/CorpusID:54682924.
[4] R. I. Ansari et al. “5G D2D networks: Techniques,
challenges, and future prospects”. In: IEEE Systems
Journal (2017). doi: 10.1109/JSYST.2017.2773633.
[5] E. Anthi et al. “A supervised intrusion detection
system for smart home IoT devices”. In: IEEE Internet
of Things Journal 6.5 (2019). doi:
10.1109/JIOT.2019.2926365, pp. 9042–9053.
[6] M. Bramer. Principles of data mining. Vol. 180. doi:
10.2165/00002018-200730070-00010. Springer, 2007.
[7] I. Butun, S. D. Morgera, and R. Sankar. “A survey
of intrusion detection systems in wireless sensor
networks”. In: IEEE communications surveys &
tutorials 16.1 (2014). doi:
10.1109/SURV.2013.050113.00191, pp. 266–282.
[8] M. J. Crawley. The R book. doi:
10.1002/9781118448908. John Wiley & Sons, 2012.
[9] A. Habbal, S. I. Goudar, and S. Hassan. “A
Context-aware Radio Access Technology selection
mechanism in 5G mobile network for smart city
applications”. In: Journal of Network and Computer
Applications 135 (2019). doi:
10.1016/j.jnca.2019.02.019, pp. 97–107.
[10] M. Hall et al. “The WEKA data mining software: an
update”. In: ACM SIGKDD explorations newsletter 11.1
(2009). doi: 10.1145/1656274.1656278, pp. 10–18.
[11] K. M. Harahsheh and C.-H. Chen. “A survey of using
machine learning in IoT security and the challenges
faced by researchers”. In: Informatica 47.6 (2023).
doi: 10.31449/inf.v47i6.4635.
[12] Z. Hashim and N. Gupta. “Futuristic device-to-device
communication paradigm in vehicular ad-hoc network”.
In: Information Technology (InCITe)-The Next
Generation IT Summit on the Theme-Internet of Things:
Connect your Worlds, International Conference on.
doi: 10.1109/INCITE.2016.7857618. IEEE. 2016,
pp. 209–214.
[13] Y. Jung, E. Festijo, and M. Peradilla. “Joint
operation of routing control and group key management
for 5G ad hoc D2D networks”. In: Privacy and Security
in Mobile Systems (PRISMS), 2014 International
Conference on. doi: 10.1109/PRISMS.2014.6970602.
IEEE. 2014, pp. 1–8.
[14] M. A. Kandi et al. “A versatile Key Management
protocol for secure Group and Device-to-Device
Communication in the Internet of Things”. In: Journal
of Network and Computer Applications 150 (2020).
doi: 10.1016/j.jnca.2019.102480, p. 102480.
[15] U. N. Kar and D. K. Sanyal. “An overview of
device-to-device communication in cellular networks”.
In: ICT Express (2017). doi:
10.1016/j.icte.2017.08.002.
[16] B. Kaufman and B. Aazhang. “Cellular networks with
an overlaid device to device network”. In: Signals,
Systems and Computers, 2008 42nd Asilomar Conference
on. doi: 10.1109/ACSSC.2008.5074679. IEEE. 2008,
pp. 1537–1541.
[17] R. Kohavi et al. “A study of cross-validation and
bootstrap for accuracy estimation and model
selection”. In: Cross validation. Vol. 14. Montreal,
Canada. 1995, pp. 1137–1145. URL:
https://www.ijcai.org/Proceedings/95-2/Papers/016.pdf.
[18] X. Lin et al. “An overview of 3GPP device-to-device
proximity services”. In: IEEE Communications Magazine
52.4 (2014). doi: 10.1109/MCOM.2014.6807945,
pp. 40–48.
[19] J. Liu et al. “Device-to-device communication in
LTE-advanced networks: A survey”. In: IEEE
Communications Surveys & Tutorials 17.4 (2015). doi:
10.1109/COMST.2014.2375934, pp. 1923–1940.
[20] N. Marchang, R. Datta, and S. K. Das. “A novel
approach for efficient usage of intrusion detection
system in mobile Ad Hoc networks”. In: IEEE Trans.
Vehicular Technology 66.2 (2017). doi:
10.1109/TVT.2016.2557808, pp. 1684–1695.
[21] P. Masek, A. Muthanna, and J. Hosek. “Suitability
of MANET routing protocols for the next-generation
national security and public safety systems”. In:
Conference on Smart Spaces. doi:
10.1007/978-3-319-23126-6_22. Springer. 2015,
pp. 242–253.
[22] Y. Meidan et al. “Detection of unauthorized iot
devices using machine learning techniques”. In: arXiv
preprint arXiv:1709.04647 (2017). doi:
10.48550/arXiv.1709.04647.
[23] A. Nadeem and M. P. Howarth. “An intrusion detection
& adaptive response mechanism for MANETs”. In: Ad Hoc
Networks 13 (2014). doi: 10.1016/j.adhoc.2013.08.017,
pp. 368–380.
[24] F. A. Narudin et al. “Evaluation of machine learning
classifiers for mobile malware detection”. In: Soft
Computing 20.1 (2016). doi:
10.1007/s00500-014-1511-6, pp. 343–357.
[25] J. Qiao et al. “Enabling device-to-device
communications in millimeter-wave 5G cellular
networks”. In: IEEE Communications Magazine 53.1
(2015). doi: 10.1109/MCOM.2015.7010536, pp. 209–215.
[26] S. Riaz, H. K. Qureshi, and M. Saleem. “Performance
evaluation of routing protocols in energy harvesting
D2D network”. In: Computing, Electronic and
Electrical Engineering (ICE Cube), 2016 International
Conference on. doi: 10.1109/ICECUBE.2016.7495233.
IEEE. 2016, pp. 251–255.
[27] H. Saadeh et al. “Hybrid SDN-ICN Architecture Design
for the Internet of Things”. In: 2019 Sixth
International Conference on Software Defined Systems
(SDS). doi: 10.1109/SDS.2019.8768582. IEEE. 2019,
pp. 96–101.
[28] J. Sengupta, S. Ruj, and S. D. Bit. “A Comprehensive
survey on attacks, security issues and blockchain
solutions for IoT and IIoT”. In: Journal of Network
and Computer Applications 149 (2020). doi:
10.1016/j.jnca.2019.102481, p. 102481.
[29] B. Subba, S. Biswas, and S. Karmakar. “Intrusion
detection in Mobile Ad-hoc Networks: Bayesian game
formulation”. In: Engineering Science and Technology,
an International Journal 19.2 (2016). doi:
10.1016/j.jestch.2015.11.001, pp. 782–799.
[30] M. N. Tehrani, M. Uysal, and H. Yanikomeroglu.
“Device-to-device communication in 5G cellular
networks: challenges, solutions, and future
directions”. In: IEEE Communications Magazine 52.5
(2014). doi: 10.1109/MCOM.2014.6815897, pp. 86–92.
[31] M. Usman et al. “A software-defined device-to-device
communication architecture for public safety
applications in 5G networks”. In: IEEE Access 3
(2015). doi: 10.1109/ACCESS.2015.2479855,
pp. 1649–1654.
[32] R. Vijayanand, D. Devaraj, and B. Kannapiran.
“Intrusion detection system for wireless mesh network
using multiple support vector machine classifiers
with genetic-algorithm-based feature selection”. In:
Computers & Security 77 (2018). doi:
10.1016/j.cose.2018.04.010, pp. 304–314.
[33] D. Wang and G. Xu. “Research on the detection of
network intrusion prevention with SVM based
optimization algorithm”. In: Informatica 44.2 (2020).
doi: 10.31449/inf.v44i2.3195.
[34] L. Wei et al. “Energy efficiency and spectrum
efficiency of multihop device-to-device
communications underlaying cellular networks”. In:
IEEE Transactions on Vehicular Technology 65.1
(2016). doi: 10.1109/TVT.2015.2389823, pp. 367–380.
[35] V. Yazıcı, U. C. Kozat, and M. O. Sunay. “A new
control plane for 5G network architecture with a case
study on unified handoff, mobility, and routing
management”. In: IEEE communications magazine 52.11
(2014). doi: 10.1109/MCOM.2014.6957146, pp. 76–85.