Volume 41 Number 2 June 2017

Special Issue: Information and Communication Technology

Guest Editors: Luc De Raedt, Yves Deville, Marc Bui, Dieu-Linh Truong

Editorial Boards

Informatica is a journal primarily covering intelligent systems in the European computer science, informatics and cognitive community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors.

The coordination necessary is made through the Executive Editors who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Higher Education, Science and Technology. Each author is guaranteed to receive the reviews of his article. When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article.

Executive Editor – Editor in Chief
Matjaž Gams
Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 251 93 85
matjaz.gams@ijs.si
http://dis.ijs.si/mezi/matjaz.html

Editor Emeritus
Anton P. Železnikar
Volaričeva 8, Ljubljana, Slovenia
s51em@lea.hamradio.si
http://lea.hamradio.si/~s51em/

Executive Associate Editor – Deputy Managing Editor
Mitja Luštrek, Jožef Stefan Institute
mitja.lustrek@ijs.si

Executive Associate Editor – Technical Editor
Drago Torkar, Jožef Stefan Institute
Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 251 93 85
drago.torkar@ijs.si

Contact Associate Editors
Europe, Africa: Matjaž Gams
N. and S. America: Shahram Rahimi
Asia, Australia: Ling Feng
Overview papers: Maria Ganzha, Wiesław Pawłowski, Aleksander Denisiuk

Editorial Board
Juan Carlos Augusto (Argentina)
Vladimir Batagelj (Slovenia)
Francesco Bergadano (Italy)
Marco Botta (Italy)
Pavel Brazdil (Portugal)
Andrej Brodnik (Slovenia)
Ivan Bruha (Canada)
Wray Buntine (Finland)
Zhihua Cui (China)
Aleksander Denisiuk (Poland)
Hubert L. Dreyfus (USA)
Jozo Dujmović (USA)
Johann Eder (Austria)
George Eleftherakis (Greece)
Ling Feng (China)
Vladimir A. Fomichov (Russia)
Maria Ganzha (Poland)
Sumit Goyal (India)
Marjan Gušev (Macedonia)
N. Jaisankar (India)
Dariusz Jacek Jakóbczak (Poland)
Dimitris Kanellopoulos (Greece)
Samee Ullah Khan (USA)
Hiroaki Kitano (Japan)
Igor Kononenko (Slovenia)
Miroslav Kubat (USA)
Ante Lauc (Croatia)
Jadran Lenarčič (Slovenia)
Shiguo Lian (China)
Suzana Loskovska (Macedonia)
Ramon L. de Mantaras (Spain)
Natividad Martínez Madrid (Germany)
Sando Martinčić-Ipišić (Croatia)
Angelo Montanari (Italy)
Pavol Návrat (Slovakia)
Jerzy R. Nawrocki (Poland)
Nadia Nedjah (Brazil)
Franc Novak (Slovenia)
Marcin Paprzycki (USA/Poland)
Wiesław Pawłowski (Poland)
Ivana Podnar Žarko (Croatia)
Karl H. Pribram (USA)
Luc De Raedt (Belgium)
Shahram Rahimi (USA)
Dejan Raković (Serbia)
Jean Ramaekers (Belgium)
Wilhelm Rossak (Germany)
Ivan Rozman (Slovenia)
Sugata Sanyal (India)
Walter Schempp (Germany)
Johannes Schwinn (Germany)
Zhongzhi Shi (China)
Oliviero Stock (Italy)
Robert Trappl (Austria)
Terry Winograd (USA)
Stefan Wrobel (Germany)
Konrad Wrona (France)
Xindong Wu (USA)
Yudong Zhang (China)
Rushan Ziatdinov (Russia & Turkey)

Editors' Introduction to the Special Issue on "Information and Communication Technology"

Since 2010, the Symposium on Information and Communication Technology (SoICT) has been organised annually. The symposium provides an academic forum for researchers to share their latest research findings and to identify future challenges in computer science. The best papers from SoICT 2015 were extended and published in the special issue "SoICT 2015" of Informatica, Vol. 40, No. 2 (2016).

In 2016, SoICT was held in Ho Chi Minh City, Vietnam, on December 8–9. The symposium covered four major areas of research: Artificial Intelligence and Big Data, Information Networks and Communication Systems, Human-Computer Interaction, and Software Engineering and Applied Computing. Of 130 submissions from 20 countries, 58 papers were accepted for presentation at SoICT 2016. Among them, 6 papers were carefully selected, after further extension and additional reviews, for inclusion in this special issue.

The paper "Improvement of Person Tracking Accuracy in Camera Network by Fusing WiFi and Visual Information" by Thi Thanh Thuy Pham, Thi-Lan Le and Trung-Kien Dao addresses the problem of person tracking in camera networks. The authors assign trajectories using the person identity (ID) determined at each video frame. In order to improve the accuracy of vision-based person tracking, the authors propose a scheme that fuses WiFi and visual signals. The fusion method allows tracking by identification in non-overlapping cameras, with clear identity information taken from the WiFi adapter.

The paper "Persons-In-Places: A Deep Features Based Approach For Searching A Specific Person In A Specific Location" by Vinh-Tiep Nguyen, Thanh Duc Ngo, Minh-Triet Tran, Duy-Dinh Le and Duc Anh Duong considers the problem of video retrieval with complex queries that simultaneously cover person and location information. The authors introduce a framework that leverages the Bag-of-Visual-Words (BOW) model and deep features for person-place video retrieval.

The research in the paper "Another Look at Radial Visualization for Class-preserving Multivariate Data Visualization" was conducted by Van Long Tran. Radial visualization is one of the common information visualization concepts for visualizing multivariate data. However, radial visualization may display different information about the structure of multivariate data.
For example, all points that are scalar multiples of a given point may map to the same point in the visual space. An optimal layout of radial visualization is usually found by defining a suitable order of the data dimensions on the unit circle. In this paper, the author proposes a novel method that improves the radial visualization layout for cluster preservation of multivariate data.

The paper "Key-Value-Links: A New Data Model for Developing Efficient RDMA-Based In-Memory Stores" by Hai Duc Nguyen, The De Vu, Duc Hieu Nguyen, Minh Duc Le, Tien Hai Ho and Tran Vu Pham proposes a new data model, named Key-Value-Links (KVL), to improve in-memory stores that use RDMA. The KVL data model is essentially a key-value model with several extensions; the store implementing it is named KELI. The results of experiments on real-life workloads indicate that KELI, even without much optimization, easily outperforms Memcached, a popular in-memory key-value store, in many cases.

The paper "Defense Strategies against Byzantine Attacks in a Consensus-Based Network Intrusion Detection System" by Michel Toulouse, Hai Le, Cao Vien Phung and Denis Hock addresses a security problem. Although the purpose of a Network Intrusion Detection System (NIDS) is to monitor network traffic so as to detect malicious usage of network facilities, the NIDS can itself be attacked. The paper investigates such vulnerabilities in a recent consensus-based NIDS proposal. It is known that consensus algorithms are not resilient to compromised nodes sharing falsified information, i.e. they can be the targets of Byzantine attacks. The paper proposes two different strategies aiming at identifying compromised NIDS modules that share falsified information. Also, a simple approach is proposed to isolate compromised modules, returning the NIDS to a non-compromised state. Validations of the defense strategies are provided through several simulations of Distributed Denial of Service attacks using the NSL-KDD data set.

The paper "Emotional contagion model for group evacuation simulation" by Xuan Hien Ta, Benoit Gaudou, Dominique Longin and Tuong Vinh Ho focuses on fear-related emotions and their positive impact on the survival capabilities of human beings in crisis situations. The authors propose a new model of emotional contagion based on some main findings in social psychology. This model was formalized mathematically, implemented and tested in the GAMA agent-based simulation platform in the context of evacuation simulation. The authors experimentally assessed the impact of three factors (emotion decay, environment, neighbours' emotional contagion) on emotion dynamics at the individual and group levels.

Luc De Raedt
Marc Bui
Yves Deville
Dieu-Linh Truong

Informatica 41 (2017) 133–148

Improvement of Person Tracking Accuracy in Camera Network by Fusing WiFi and Visual Information

Thi Thanh Thuy Pham
Academy of People Security, Hanoi, Vietnam
E-mail: thanh-thuy.pham@mica.edu.vn

Thi-Lan Le and Trung-Kien Dao
MICA International Research Institute, Hanoi University of Science and Technology (HUST - CNRS/UMI-2954 - Grenoble INP), Hanoi, Vietnam
E-mail: {thi-lan.le, trung-kien.dao}@mica.edu.vn

Keywords: camera, WiFi, fusion method, person tracking by identification

Received: March 29, 2017

Person tracking in camera networks is still an open problem. The main challenge is how to correctly link individual trajectories when people move within one camera FOV (field of view) or switch to another one.
This requires solving the problem of person re-identification (Re-ID) during tracking. A popular method is to assign the current position to the previous one based on the minimum distance between them; this is called person identification by tracking. In this work, we approach tracking by identification, which means the trajectory assignment is done by the person identity (ID) determined at each video frame. In order to improve the accuracy of vision-based person tracking, we focus on accuracy enhancement for person identification by adding the ID of the WiFi-enabled device held by each person. A fusion scheme of WiFi and visual signals is proposed in this work for person tracking. An optimal assignment and a Kalman filter are used in this combination to match the position observations and predicted states from the camera and WiFi systems. The correction step of the Kalman filter is then applied for each tracker to output state estimations of locations. The fusion method allows tracking by identification in non-overlapping cameras, with clear identity information taken from the WiFi adapter. The evaluation on a multi-modal dataset shows that the proposed fusion method outperforms the vision-only method.

Povzetek: A method for person tracking across cameras by means of data fusion is described.

1 Introduction

There have been several attempts to combine camera and WiFi systems for indoor person tracking. A multi-modal system is reported in [1] using WiFi-based localization and tracking by stationary cameras. The combined system focuses on improving the positioning accuracy and confidence at room level. According to the authors' assessments, camera-based localization achieves higher positioning accuracy than a WiFi-based system. However, blind points, occlusions and person identification are much more challenging for camera systems. WiFi systems give clearer identity information because each mobile device has a unique MAC address, but the considered targets are required to hold mobile devices during tracking. In that work, the RSSI property and a fingerprinting method are used in the WiFi system to locate mobile targets. In the camera-based system, foreground segmentation is done by the GMM (Gaussian Mixture Model) method. The region which contains the person's feet is then extracted from the foreground and projected on the floor plane. Gaussian kernels are used to model the foot region. Each single module is executed depending on the availability of each sensor's information. When both are available, a combined Bayes model with the corresponding confidence weights is applied.

The authors in [2] reported another approach for object localization fusing images and WiFi signals. The system can be deployed in both indoor and outdoor environments. The algorithm of PlaceEngine [3] and a modified version of the Centroid algorithm [4] are used in that work for WiFi-based localization. The mixture of observation models based on a Particle filter allows targets to be tracked continuously even when they are occluded by other objects or temporarily disappear while moving in blind areas among disjoint cameras.

In [5], the authors proposed to combine RGB data with wireless signals emitted from a person's cell phone to locate and track individuals. The authors considered the unique MAC address of a mobile device as a reliable cue of a person's ID.
Wireless data is efficiently embedded in RGB data as a ring image, which captures the radius estimation, error bounds, and confidence level (noise detection) for each antenna. In order to improve the tracking algorithm, each MAC address is assigned to an observed tracklet, and a bipartite graph is proposed for the data association problem. The testing results proved that the performance of person localization and tracking can be improved by fusing RGB and wireless data.

In this paper, we propose a fusion method of WiFi and camera for person localization and Re-ID in a camera network. It improves vision-based person tracking not only within one camera FOV, but also among different camera FOVs, by using the unique ID information from the WiFi hardware.

The rest of the paper is organized as follows. In Section 2, a framework for multi-modal person tracking by fusion of WiFi and camera is presented. Sections 3 and 4 describe the single person localization systems based on visual and WiFi signals, respectively. The combined method of WiFi and camera is discussed in Section 5. The comparative evaluations are shown in Section 6. Conclusions and future directions are given in the last section.

2 Framework

Figure 1 shows the fusion framework for person localization and Re-ID in non-overlapping camera networks. The combined model is processed in the real scenario of a fully-automated person surveillance system, which is reported in our previous work [6]. In this system, the camera FOVs are covered by the WiFi range. This means WiFi signals are always available for person localization, but the disjoint camera shot areas cause intermittent positioning for the vision-based system. In each camera FOV, person localization is done in three phases, i.e., human detection, tracking and localization, to output the person ID determined by the camera (ID_j^C) and the corresponding position (P_j^C). Because the WiFi range covers the camera FOVs, in each camera FOV the vision-based positioning result of person j is combined with the WiFi-based localization result of person i (P_i^W, ID_i^W) by a fusion algorithm, in order to make effective decisions about the position and identity of each person in the environment. When people switch from one camera FOV to another, they are re-identified to update the ID of each individual trajectory. The trajectories through the cameras are also linked to show the entire route in the environment. Additionally, in the fusion model, WiFi-based localization results are used to activate the cameras which are in the positioning range returned by the WiFi-based system. The proposed mixture model allows continuous localization and identification of a person moving in non-overlapping camera networks.

In the proposed system, the positioning processes are executed independently in each single model. The locations calculated from both models of WiFi and camera are shown on the uniform coordinate system of a 2D floor map. A fusion algorithm for person localization and Re-ID is proposed. It is based on a Kalman filter model, together with an optimal assignment of the estimated and observed locations from both models. The details of each single person localization system and the proposed fusion algorithm are given in the next sections.
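To make the data flow of this framework concrete, the following minimal Python sketch mirrors one fusion step. It is only an illustration of the architecture: the names `Observation`, `fuse` and `track_step` are hypothetical, and the greedy nearest-neighbour ID transfer here stands in for the optimal-assignment and Kalman machinery detailed in Section 5.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional

@dataclass
class Observation:
    x: float                     # position on the 2D floor map (metres)
    y: float
    ident: Optional[str] = None  # MAC-derived ID (WiFi observations only)

def fuse(wifi_obs: List[Observation],
         cam_obs: List[Observation]) -> List[Observation]:
    """Greedy stand-in for the optimal assignment of Section 5:
    each camera position inherits the ID of the nearest WiFi position.
    Assumes at least one WiFi observation (everyone checks in a device)."""
    fused = []
    for c in cam_obs:
        nearest = min(wifi_obs, key=lambda w: hypot(w.x - c.x, w.y - c.y))
        fused.append(Observation(c.x, c.y, nearest.ident))
    return fused

def track_step(wifi_obs, cam_obs):
    # Cameras give the accurate position; WiFi always covers the scene,
    # so fall back to WiFi-only positions outside every camera FOV.
    return fuse(wifi_obs, cam_obs) if cam_obs else wifi_obs
```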
3 Vision-based person localization and Re-ID

Camera-based person localization and Re-ID is the process of finding the positions and the corresponding ID of a person when he/she moves in one camera FOV or switches from one camera FOV to others in a camera network. It refers to linking person trajectories in the frame sequences captured from multiple cameras. These trajectories are then transformed to the real-world coordinate system by a process called 3D localization.

3.1 Person localization

A camera-based person localization system includes three main steps: human detection, tracking and 3D localization. For each camera FOV, human detection is executed at each frame to output the human ROI (Region of Interest), which is represented by a rectangular bounding box containing the person. The person's position on the image is defined in this work as the middle point of the rectangle's bottom edge, which is in contact with the floor plane (see Figure 2). It is called a FootPoint position. Human tracking in a frame sequence captured from a camera FOV is considered as FootPoint tracking. In case of multi-person tracking, each detected FootPoint has to be assigned the corresponding ID. 3D person localization is done by transforming FootPoint positions to real-world locations on a predefined 2D coordinate system of the floor plane where the person moves.

Figure 1: Framework of person localization and Re-ID using the combined system of WiFi and camera.

Figure 2: Examples of tracking lines which are formed by linking trajectories of corresponding FootPoint positions.

First, a combination of HOG-SVM and GMM background subtraction techniques [6] is applied for human detection. In order to improve the performance of human detection, the shadow removal method in [6] is used as a post-processing step.

Second, in each camera FOV, based on the detection results, FootPoint tracking is done by utilizing a Kalman filter and the Hungarian data association algorithm [7] to improve the performance of track association. For each camera, a grid of the floor plane where people move in the camera FOV, namely the detection grid (see Figure 3), is defined as a function G(x, y):

G(x, y) = \begin{cases} 1 & \text{if } (x, y) \in C_T, \\ 0 & \text{otherwise,} \end{cases}

where C_T is a threshold region bounded by a contour line which is the border of the camera FOV on the floor plane. As each detected person is represented by a FootPoint position, a FootPoint position can only belong to one of the positions of the detection grid where G(x, y) = 1.

Figure 3: Example of a grid map and threshold region bounded by a contour line.

Let (px_t, py_t) denote the pixel coordinates of a FootPoint position at time t in the grid, (mx_t, my_t) the pixel coordinates of a measurement in the grid, so that G(mx_t, my_t) = 1, and (vx_t, vy_t) the velocity values at time t in the x and y directions. The state vector x_t of a user at time frame t, characterized by the corresponding FootPoint location, and the measurement vector z_t are defined as:

x_t = (px_t, py_t, vx_t, vy_t)   (1)

z_t = (mx_t, my_t)   (2)

Using the state and measurement update equations of the Kalman filter, in conjunction with the initial conditions, the state vector and its covariance matrix are estimated at each time frame. The 2D spatial coordinates of an estimated state (p̂x, p̂y) (an estimated FootPoint position) give the position p of the user u.
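To illustrate the detection grid G(x, y) defined above, the sketch below rasterises a camera-FOV border contour into a boolean lookup grid with OpenCV. This is a minimal sketch under stated assumptions: the contour coordinates and grid size are made-up values, not taken from the paper.

```python
import numpy as np
import cv2

def build_detection_grid(shape, fov_contour):
    """Rasterise the camera-FOV border contour C_T on the floor plane
    into a boolean grid so that G(x, y) = 1 inside the threshold region."""
    grid = np.zeros(shape, dtype=np.uint8)
    cv2.fillPoly(grid, [fov_contour.astype(np.int32)], 1)
    return grid.astype(bool)

# Illustrative contour of one camera's FOV footprint (values made up).
contour = np.array([[50, 40], [400, 60], [420, 300], [60, 280]])
G = build_detection_grid((480, 640), contour)   # rows = y, cols = x

def is_valid_footpoint(G, x, y):
    # G(x, y) = 1 -> a plausible FootPoint cell inside the FOV border
    return bool(G[y, x])
```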
In multi-person tracking, a separate Kalman filter is initialized for, and models, each person's trajectory. A set U_t of individuals and a set M_t of measurements at time frame t are defined as:

U_t = \{u_1, u_2, .., u_N\}   (3)

M_t = \{m_1, m_2, .., m_L\}   (4)

where N is the number of people to be tracked (trackers), and L is the number of available measurements at time t. In order to assign a person i to a measurement j, the Hungarian method is used.

Third, in order to locate people in the real-world coordinate system, we define a 2D map of the floor plane on which people move. This map contains all considered camera FOVs on the floor plane. We then calculate the coordinates of each FootPoint position on the 2D map on the basis of camera calibration and a homography transform [8]. The trajectories of each person through the cameras are then linked by a method of warping multiple camera FOVs using a stereo calibration technique [9].

3.2 Person re-ID

In this paper, the person Re-ID problem is solved in the scenario of tracking by identification. This means that at each detected FootPoint position, we extract the human ROI, and a feature descriptor is built on this region. In this work, a robust KDES descriptor (Kernel Descriptor), which was proposed in our previous work [6], and an SVM classifier are used for person Re-ID in camera networks. The basic idea of the KDES descriptor is to compute an approximate explicit feature map for a kernel match function (see Figure 4). In other words, the kernel match functions are approximated by explicit feature maps. This enables efficient learning methods for linear kernels to be applied to non-linear kernels. Given a match kernel function k(x, y), the feature map ϕ(·) for the kernel k(x, y) is a function mapping a vector x into a feature space so that k(x, y) = ϕ(x)^⊤ϕ(y). Given a set of basis vectors B = \{ϕ(v_i)\}_{i=1}^{D}, the approximation of the feature map ϕ(x) can be:

φ(x) = G k_B(x)   (5)

where G^⊤G = K_{BB}^{-1}, K_{BB} is a D × D matrix with \{K_{BB}\}_{ij} = k(v_i, v_j), and k_B is a D × 1 vector with \{k_B\}_i = k(x, v_i).

Figure 4: The basic idea of representation based on kernel methods: the kernel match function k(x, y) ≈ φ(x)^⊤φ(y) is approximated by an explicit feature map x → φ(x).

Similar to [10], three match kernel functions for gradient, color and shape are built from different pixel attributes of gradient, color and local binary pattern (LBP). For each match kernel, feature extraction is done at three levels: pixel, patch and the whole detected human region.
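A small numerical sketch of the feature-map approximation in Eq. (5) follows. It assumes a Gaussian match kernel and random toy basis vectors (the actual KDES kernels operate on gradient, color and LBP pixel attributes); G is taken as K_BB^{-1/2}, which satisfies G^⊤G = K_BB^{-1}.

```python
import numpy as np

def gaussian_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def feature_map_factory(basis, kernel=gaussian_kernel):
    """Approximate feature map of Eq. (5): phi(x) = G k_B(x) with
    G^T G = K_BB^{-1}, so phi(x)^T phi(y) ~= k_B(x)^T K_BB^{-1} k_B(y)."""
    D = len(basis)
    K_BB = np.array([[kernel(vi, vj) for vj in basis] for vi in basis])
    # G = K_BB^{-1/2} via eigendecomposition (small ridge for stability)
    w, V = np.linalg.eigh(K_BB + 1e-8 * np.eye(D))
    G = (V / np.sqrt(w)) @ V.T
    def phi(x):
        k_B = np.array([kernel(x, v) for v in basis])   # D x 1 vector
        return G @ k_B
    return phi

rng = np.random.default_rng(0)
basis = rng.normal(size=(8, 5))          # toy basis vectors {v_i}
phi = feature_map_factory(basis)
x, y = rng.normal(size=5), rng.normal(size=5)
# The two values agree approximately when the basis covers the inputs well.
print(phi(x) @ phi(y), gaussian_kernel(x, y))
```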
4 WiFi-based person localization

For WiFi, RSSI is the most popular attribute used in localization. However, the localization performance depends much on how well we can model the relationship between RSSI and distance. Two main approaches have been proposed to solve this: path-loss/radio propagation models [12, 13] and fingerprinting methods [14]. The first is still an open subject, because it is not easy to obtain an optimal model of the relationship between RSSI and distance. The second is time- and workforce-consuming, but it is effective for localization, especially when probabilistic methods are applied.

In this work, both a radio propagation model and a fingerprinting method for WiFi-based localization are employed. A probabilistic propagation model (PPM) from [11], together with a newly defined radio map in the fingerprinting database, are used. The radio propagation model reflects the complex nature of indoor environments by taking into account obstacles, such as walls and floors, to model the relationship between the RSSI value and the distance to a reference point (RP). The model is based on the empirical equation of radio-frequency signal strength in indoor environments, and its uncertainty is accounted for by probabilistic characteristics. An optimization process based on a genetic algorithm is also applied to tune the system parameters for the best fit with the devices in use. Based on the probabilistic propagation model, the distance between a mobile user and the APs is calculated. In the fingerprinting database, a new radio map of distance features, instead of RSSI values, is defined in order to make the radio map more reliable and stable, with lower cost for setting up and updating. Additionally, the KNN matching method is applied with an additional coefficient reflecting temporal changes of the fingerprinting data in the environment. The flowchart of the proposed WiFi-based person localization system is illustrated in Figure 5, with two main phases of training and testing.

Figure 5: Diagram of the proposed WiFi-based object localization system (offline training phase: RP coordinates and PPM-derived distance values stored in the fingerprint database on the server; online testing phase: RSSI values from the mobile user converted by the PPM to distance values and matched by KNN to output the position).

The first phase is processed offline: radio maps are constructed to build the fingerprint database. Normally, a radio map contains RP coordinates and the corresponding RSSI values from the available APs. However, in our proposed system, the RSSI values are replaced by distance values. A distance value is defined as the distance d_i(L) from the i-th RP to the L-th AP in range (see Figure 6), calculated from RSSI observations by using the PPM model.

Figure 6: An example of a radio map with a set of RPs p_i and the distance values d_i(L) from each RP to L APs.

In the testing phase, a mobile device continuously scans signals from nearby APs and sends the corresponding RSSI values to a server. These values are then transformed to distance values by the proposed probabilistic propagation model. Distance matching against the fingerprint database is done by the KNN method to find the best candidates for the mobile user's location.

4.1 Probabilistic propagation model

The probabilistic propagation model is formed by a deterministic model, given in Eq. (6), and a probabilistic model:

P = P_0 - 10 n \log\left(\frac{r}{r_0}\right) - k_d \sum_{i=1}^{n_w} \frac{d_i}{\cos\beta_i}   (6)

where n_w is the number of walls and floors between the AP and the receiver, d_i is the thickness of the i-th wall/floor, β_i is the angle of arrival corresponding to the i-th wall/floor, and k_d is an attenuation factor per wall/floor thickness unit, as illustrated in Figure 7.

Figure 7: WiFi signal attenuation through walls/floors.
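The deterministic model of Eq. (6) can be evaluated and inverted in closed form once the wall geometry is known. The sketch below assumes a base-10 logarithm and reuses the first-scenario parameter values later reported in Table 2 (Section 6.3.1); the wall list is illustrative only.

```python
import math

def predicted_rssi(r, walls, P0=-41.0, n=1.1, r0=5.0, kd=49.23):
    """Deterministic model of Eq. (6). `walls` is a list of
    (thickness_m, angle_of_arrival_rad) pairs."""
    attenuation = kd * sum(d / math.cos(beta) for d, beta in walls)
    return P0 - 10.0 * n * math.log10(r / r0) - attenuation

def distance_from_rssi(P, walls, P0=-41.0, n=1.1, r0=5.0, kd=49.23):
    """Closed-form inversion of Eq. (6) for the nominal distance r-bar."""
    attenuation = kd * sum(d / math.cos(beta) for d, beta in walls)
    return r0 * 10.0 ** ((P0 - attenuation - P) / (10.0 * n))

# One wall of 0.2 m crossed at 30 degrees (illustrative numbers only):
walls = [(0.2, math.radians(30.0))]
P = predicted_rssi(8.0, walls)
print(P, distance_from_rssi(P, walls))   # recovers r = 8.0
```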
The deterministic model in Eq. (6) does not consider the uncertainty of the RSSI values at a given distance, so a probabilistic model (Eq. (7)) is proposed. In reality, given an RSSI P, the distance r might not be exactly the value calculated from Equation (6), but it lies within a range around this value, whose center is denoted by r̄. More precisely, r̄ is the nominal value of the distance r with the highest probability. Given an RSSI P, the distribution of the distance is assumed to follow a normal (Gaussian) distribution with median r̄:

ρ(r, P) = \Pr(r \mid P) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(r - \bar{r})^2}{2\sigma^2}}   (7)

where σ is the standard deviation, which is also a function of P. For simplicity, σ is assumed to be related to r̄ by a linear relation:

σ = k_σ \bar{r}   (8)

In the proposed probabilistic propagation model, there are in total five parameters to be determined: P_0, r_0, n, k_d and k_σ. Except for k_σ, the parameters can be estimated separately from individual measurements in a straightforward manner. However, the values of these parameters can be slightly affected by the assumptions taken in the RF (radio frequency) propagation model. For this reason, a genetic algorithm (GA) [15] is used to find the optimal parameter set all together. Genetic algorithms are global search techniques modeled after the natural genetic mechanism to find approximate or exact solutions for optimization and search problems. In a GA, each parameter to be optimized is represented by a gene. Moreover, each individual is characterized by a chromosome, which is here the above set of parameters awaiting optimization. To assess the quality of an individual, a fitness function (objective function, or cost function) must be defined. For the localization module, the fitness function Ψ is defined as the root mean square of the localization error:

Ψ = \left(\frac{1}{N}\sum_{i=1}^{N}\left[(\hat{x}_i - x_i)^2 + (\hat{y}_i - y_i)^2 + (\hat{z}_i - z_i)^2\right]\right)^{1/2}   (9)

where N is the number of measurements, and (x_i, y_i, z_i) and (x̂_i, ŷ_i, ẑ_i) are the real and the estimated positions, respectively.
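The fitness function of Eq. (9) is straightforward to implement. A minimal sketch follows; the full GA loop from [15] is omitted, and the test data here are synthetic.

```python
import numpy as np

def fitness(est, real):
    """Fitness Psi of Eq. (9): root mean square localization error over
    N measurements, with rows (x, y, z) of estimated and real positions."""
    est, real = np.asarray(est), np.asarray(real)
    return np.sqrt(np.mean(np.sum((est - real) ** 2, axis=1)))

rng = np.random.default_rng(1)
real = rng.uniform(0, 10, size=(100, 3))
est = real + rng.normal(scale=0.5, size=real.shape)
print(fitness(est, real))   # about 0.5 * sqrt(3) for 0.5 m noise per axis
```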
4.2 Fingerprinting database and KNN matching

Normally, a radio map in the fingerprinting method is defined as follows:

R ≜ \{(p_i, F(p_i)) \mid i = 1, .., N\}   (10)

where p_i ≜ [p_x\ p_y\ p_z]^⊤ are the real-world coordinates of the i-th RP and F(p_i) ≜ [r_i(1), .., r_i(n)] is the fingerprinting matrix, with n being the number of training samples at each RP. The vector r_i(t) ≜ [r_i^1(t), .., r_i^L(t)]^⊤ contains the RSSI values scanned from L APs at time t and location p_i. By using the distance feature instead of RSSI, the radio map in Equation (10) then has the fingerprinting matrix F(p_i) ≜ [d_i(1), .., d_i(n)], where the vector d_i(t) ≜ [d_i^1(t), .., d_i^L(t)]^⊤ contains the distance samples from the i-th RP to the L APs. This results in a reliable and stable radio map even in case some APs are inactive at a certain point in time. Furthermore, the cost of setting up and updating the radio map is much lower than when using RSSI as usual: it is only rebuilt when we deploy new APs and RPs or discard them from the WiFi-based localization system.

In the testing phase, the RSSI values scanned from nearby APs by a mobile device are converted to the corresponding distance values by the PPM model. They are then compared with the training data to find the best matches. The matching method used in this work is KNN: the prediction for a new instance is based on its nearest neighbors in the training data. There are three main ingredients associated with this method: (1) the similarity measure (the distance measurement) between the query patterns and the training data; (2) the number of neighbors to be taken in the prediction; (3) the weights of the neighbors. Euclidean and Manhattan distances are two common geometric measures, of which Euclidean is the most used in WiFi-based localization systems [16, 17]. In this work, the KNN method is evaluated with the Euclidean measure.

In the proposed radio map, each RP is represented by a vector d_i(t) ≜ [d_i^1(t), .., d_i^L(t)]^⊤ in an L-dimensional space. In the learning phase, all these training data D with their dependent variables are stored. In this case, the dependent variables are the positions p_i of the RPs in the environment. In prediction, for a new query pattern z and for each instance d in D, the similarity between d and z is computed by the Euclidean distance measure:

l(d, z) = \sqrt{\sum_{i=1}^{L} (d_i - z_i)^2}   (11)

A set NB(z) of the nearest neighbors of z with |NB(z)| = k is then determined, and the estimated location for z is calculated. To find an optimal k, we test on the empirical data with k in the range from 1 to 200, using the error function (12) for each k:

E_k = \sqrt{\sum_{i=1}^{n} \left(\frac{\hat{y}_i - y_i}{y_i}\right)^2}   (12)

where ŷ is the estimated position and y the true position. Finally, the predicted location of z is calculated as the weighted sum of the k neighbors (13):

y_z = \frac{\sum_{d \in NB(z)} w(d, z) \times y_d}{\sum_{d \in NB(z)} w(d, z)}   (13)

where the weights w are chosen by (14):

w(d, z) = e^{-\theta \times l(d, z)} \times e^{-\lambda \times |t_i - t_0|}   (14)

where θ and λ are constants used to define the curve of the exponential functions; t_0 is the time a query instance is captured, and t_i is the time of WiFi signal scanning at each corresponding RP in the training phase; l(d, z) is the dissimilarity between a query instance and its neighbor. In Equation (14), besides the weight based on dissimilarity (θ), a new coefficient λ is proposed to reflect the chronological changes of the fingerprinting data in the environment. This means that fingerprinting data updated recently with respect to the query instance will have a higher weight than older data.
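Equations (11)–(14) combine into a compact weighted-KNN estimator. A minimal NumPy sketch follows, using the k, θ and λ values chosen later in Section 6.3.1; the array names are illustrative assumptions.

```python
import numpy as np

def knn_locate(z, D, P, times, t0, k=9, theta=1.1, lam=2e-6):
    """Weighted KNN of Eqs. (11)-(14). D holds one L-dimensional
    distance fingerprint per row, P the matching RP coordinates,
    `times` the scan time (s) of each fingerprint, z the query fingerprint."""
    l = np.linalg.norm(D - z, axis=1)        # Eq. (11), Euclidean measure
    nb = np.argsort(l)[:k]                   # the k nearest neighbours NB(z)
    w = np.exp(-theta * l[nb]) * np.exp(-lam * np.abs(times[nb] - t0))  # Eq. (14)
    return (w[:, None] * P[nb]).sum(axis=0) / w.sum()                   # Eq. (13)
```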
5 Proposed fusion method

In order to improve the performance of person tracking in camera networks, for each camera FOV, the person locations determined by the WiFi system are optimally assigned to the positioning results from the camera system. This allows us not only to maintain the high accuracy of vision-based person localization, but also to improve the performance of person tracking in camera networks by assigning the clearer ID of the WiFi adapter to each position determined by the camera system. Algorithm 1 shows the combined method of the WiFi and camera systems for people localization and identification.

At time t, on the 2D floor map, a set of position observations from the WiFi system (z_{i,t}^w) or the camera system (z_{j,t}^c) for multiple targets is available. Index i designates one among N targets located by the WiFi system, and index j refers to one of M positions observed by the camera system. We consider recursively two consecutive observations of the localization results from any available sensors. At time t, assume that we have a set of location observations coming from the WiFi system for N targets, with z_{i,t}^w = (X_{i,t}^w, Y_{i,t}^w, ID_{i,t}^w), and that at the previous time step (t-1) we obtained the observations z_{j,t-1}^c = (X_{j,t-1}^c, Y_{j,t-1}^c) for M positions from the camera system. Without loss of generality, we can consider these observations as the state estimations at time t-1. The prediction step of the Kalman filter (KalmanPrediction) is applied to estimate the next state x_{j,t}^c based on z_{j,t-1}^c. An assignment algorithm is then utilized to find optimal matchings between the estimated states x_{j,t}^c from the camera system and the observations z_{i,t}^w from the WiFi system. Considering the result K_{i,t} of the assignment as the observation at the current time t, the predicted state x_t is then corrected by the KalmanCorrection step, by which the WiFi-based positions are augmented with the vision-based positions.

Algorithm 1: Person tracking by fusion of position observations from WiFi and camera systems.

Input: position observations z from the WiFi and camera localization systems
Output: position estimations x

  Parameters initialization: A, H, P_1, Q, R
  for each set of position observations z do
    if z_{i,t} is from the WiFi localization system [z_{i,t}^w = (X_{i,t}^w, Y_{i,t}^w, ID_{i,t}^w)] then
      if z_{j,t-1} is from the camera localization system [z_{j,t-1}^c = (X_{j,t-1}^c, Y_{j,t-1}^c)] then
        [x_{j,t}^c, P_t] = KalmanPrediction(A, Q, z_{j,t-1}^c, P_{t-1})
        K_{i,t} = Assignment(x_{j,t}^c, z_{i,t}^w)
        [x_{i,t}^w, P_t] = KalmanCorrection(H, R, K_{i,t}, x_t, P_t)
        Save x_{i,t}^w as a state estimation at time t
      end
    else [z_{j,t} = (X_{j,t}^c, Y_{j,t}^c)]
      if z_{i,t-1} is from the WiFi localization system [z_{i,t-1}^w = (X_{i,t-1}^w, Y_{i,t-1}^w, ID_{i,t-1}^w)] then
        [x_{i,t}^w, P_t] = KalmanPrediction(A, Q, z_{i,t-1}^w, P_{t-1})
        K_{i,t} = Assignment(x_{i,t}^w, z_{j,t}^c)
        [x_{i,t}^w, P_t] = KalmanCorrection(H, R, K_{i,t}, x_t, P_t)
        Save x_{i,t}^w as a state estimation at time t
      end
    end
  end
  return x_{i,t}^w

5.1 Kalman filter

In the proposed fusion algorithm, the state prediction step of the Kalman filter is used to estimate the process state at a certain time based on the position observation or measurement obtained at the previous time. The correction step of the Kalman filter is done after the optimal assignment between the estimated states and the observations at a certain time. In this case, the process state to be estimated at a certain time is defined as the position p_t of a person in the real-world coordinate system of the 2D floor map. It is represented by a state vector x_t of the location coordinates pX_t and pY_t on the 2D floor map, together with the corresponding velocity values vX_t and vY_t:

x_t = (pX_t, pY_t, vX_t, vY_t)   (15)

A position observation z_t is then defined as follows:

z_t = (mX_t, mY_t)   (16)

Assuming constant velocity and acceleration in the movement of people, and that the position is measured n times per second, the state equations are defined as follows:

pX_t = pX_{t-1} + vX_{t-1} \Delta T   (17)

pY_t = pY_{t-1} + vY_{t-1} \Delta T   (18)

vX_t = vX_{t-1}   (19)

vY_t = vY_{t-1}   (20)

where ΔT = 1/n. The state transition matrix A and the state-measurement matrix H are then defined as:

A = \begin{bmatrix} 1 & 0 & \Delta T & 0 \\ 0 & 1 & 0 & \Delta T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}

Kalman-based tracking is started after the first successfully calculated position from the WiFi or camera system, with the initial state vector x_1. The initial covariance matrix P_1 for the initial state is:

P_1 = \begin{bmatrix} \sigma_{x_1}^2 & 0 & 0 & 0 \\ 0 & \sigma_{y_1}^2 & 0 & 0 \\ 0 & 0 & \sigma_{vx_1}^2 & 0 \\ 0 & 0 & 0 & \sigma_{vy_1}^2 \end{bmatrix}

The state noise covariance matrix Q and the measurement noise covariance matrix R are defined as:

Q = \begin{bmatrix} \sigma_{pX}^2 & 0 & 0 & 0 \\ 0 & \sigma_{pY}^2 & 0 & 0 \\ 0 & 0 & \sigma_{vX}^2 & 0 \\ 0 & 0 & 0 & \sigma_{vY}^2 \end{bmatrix}, \quad R = \begin{bmatrix} \sigma_{mX}^2 & 0 \\ 0 & \sigma_{mY}^2 \end{bmatrix}

where σ² denotes the squared deviation in centimeters from the real value of each quantity. The measurement noise refers to the noise of the calculated positions from the WiFi or camera system, and the state noise is defined according to the motion of people. The initial covariance matrix P_1 for the initial state x_1 is set under the assumption that the calculated position deviates by ±5 cm from the real position in both the X and Y directions, and the velocity by ±3 cm. Similarly, the state noise covariance matrix Q is set with standard deviations of ±5 cm and ±3 cm for the determined position and its velocity, respectively. The measurement noise covariance matrix R is described with a standard deviation of 3 cm for the FootPoint measurement in the X and Y directions, and ΔT is set to 1, meaning that the position is measured every second.
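A minimal NumPy sketch of the KalmanPrediction and KalmanCorrection steps used in Algorithm 1 is given below, with A, H, Q and R as defined above (ΔT = 1 s; the noise magnitudes follow the ±5 cm / ±3 cm / 3 cm settings stated in the text).

```python
import numpy as np

dT = 1.0                                   # position measured once per second
A = np.array([[1, 0, dT, 0],
              [0, 1, 0, dT],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)  # constant-velocity transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # observe position only
Q = np.diag([5.0, 5.0, 3.0, 3.0]) ** 2     # state noise (cm), as in the text
R = np.diag([3.0, 3.0]) ** 2               # measurement noise (cm)

def kalman_prediction(x, P):
    """Predict the next state from the previous observation/estimate."""
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    return x_pred, P_pred

def kalman_correction(x_pred, P_pred, z):
    """Correct the prediction with the assigned observation K_{i,t}."""
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(4) - K @ H) @ P_pred
    return x, P
```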
5.2 Optimal assignment

After the Kalman prediction step, we have a position estimation x_{j,t}^c or x_{i,t}^w for the camera or WiFi system, respectively. Consider first the case of a position estimation x_{j,t}^c at time t for the camera system, estimated from the previous observation of a vision-based location z_{j,t-1}. Then, optimal assignment at time t between x_{j,t}^c and z_{i,t}^w is applied. Assuming that the assignment of an estimated position x_j and an observation z_i incurs a cost d_{ij}, which is the Euclidean distance between them, the matrix D_{M×N} of the costs or distances between every x ∈ M and z ∈ N is defined as:

D = \begin{bmatrix} d_{11} & d_{12} & \dots & d_{1N} \\ d_{21} & d_{22} & \dots & d_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ d_{M1} & d_{M2} & \dots & d_{MN} \end{bmatrix}

where d_{ij} = \sqrt{(X_j^c - X_i^w)^2 + (Y_j^c - Y_i^w)^2}. The assignment is now formulated as a linear assignment problem:

\min \sum_{i \in N} \sum_{j \in M} d_{ij} x_{ij}   (21)

subject to

\sum_{i \in N} x_{ij} = 1 \quad \forall j \in M, \qquad \sum_{j \in M} x_{ij} = 1 \quad \forall i \in N, \qquad x_{ij} \ge 0 \quad \forall i \in N,\ j \in M

This optimal assignment is done with the following constraints:

– If N = M, for each pair (x_{j,t}^c, z_{i,t}^w), we augment the position x_{j,t}^c with the identity ID_{i,t}^w from z_{i,t}^w;

– If N > M, all unassigned z_{i,t}^w are kept with their original coordinates, which are computed from the WiFi-based localization system;

– If N < M, all unassigned x_{j,t}^c are considered as false positives and are discarded, because we assume in the surveillance system that all people entering the monitored areas hold WiFi-enabled devices and have checked in at the entrance.

The overall formula for these constraints is given as follows:

K_{i,t} = \begin{cases} (X_{j,t}^c, Y_{j,t}^c, ID_{i,t}^w) & \text{if } z_{i,t}^w \text{ is assigned;} \\ (X_{i,t}^w, Y_{i,t}^w, ID_{i,t}^w) & \text{otherwise,} \end{cases}

where K_{i,t} denotes the association between the position estimations x_{j,t}^c and the observations z_{i,t}^w. Each component of K_{i,t} is a random variable that takes its value among \{0, .., N\}. Based on this association, the location information from the WiFi-based observations is corrected according to the positions given by the camera system, and the corresponding ID from the WiFi system is assigned. The correction step of the Kalman filter is applied to update the predicted state with the current position observation K_{i,t}.

The same procedure is applied in the case in which WiFi-based location observations come before camera-based ones, where we have the optimal assignment of an estimated position x_i from the WiFi system and an observation z_j from the camera system.
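The linear assignment problem of Eq. (21), together with the N vs. M constraints above, can be sketched with SciPy's Hungarian-method solver (`scipy.optimize.linear_sum_assignment`). Identities travel with the WiFi row order, and all names here are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign(x_cam, z_wifi):
    """Solve the linear assignment of Eq. (21) on the Euclidean cost
    matrix D; rows are camera estimates, columns WiFi observations.
    Returns one fused observation K_{i,t} per WiFi target: an assigned
    target takes the camera position, an unassigned one keeps its own,
    and unassigned camera estimates are discarded as false positives."""
    x_cam, z_wifi = np.asarray(x_cam), np.asarray(z_wifi)
    D = np.linalg.norm(x_cam[:, None, :] - z_wifi[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(D)   # Hungarian method
    K = z_wifi.copy()                       # default: WiFi coordinates
    K[cols] = x_cam[rows]                   # assigned: camera coordinates
    return K                                # row i keeps identity ID_i^w
```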
6 Dataset and evaluation

6.1 Testing dataset

In order to evaluate the combined algorithm for person tracking using both the WiFi and camera systems, a multi-modal dataset with two scripts was constructed in this work. Script 1 is set up with simpler scenarios than Script 2. Two people are involved in Script 1, with random routes through two non-overlapping cameras. Some inter-person occlusions appear, but not as frequently as in Script 2. The visual data in Script 1 is used for camera-based person localization and Re-ID. Script 2 contains five scenarios involving different numbers of participants: one person, and two, three, and five moving people. The data in Script 2 is very challenging for both the WiFi-based and vision-based systems. People move through four different cameras. Severe occlusions happen because all people are required to move in close proximity along a fixed route (see Figure 8). Moreover, the similar human appearance is a challenge for the visual processing problems.

Figure 8: A 2D floor map of the testing environment in Figure 9, with the routing path of the moving people in the testing scenarios.

Figure 9: Testing environment.

The testing environment used for building the dataset is shown in Figure 9; 6 access points (APs) and 4 cameras are deployed in the environment. The APs are set to the same SSID, which assures continuous connectivity for mobile devices when people move from the range of one AP to another. The WiFi range of each AP is about 30–50 meters in radius, depending on the walls and obstacles in the environment. The AP specifications are the MAC address and the AP position in X, Y and Z. All APs used in the testing are Linksys E1200 devices. A person holds a WiFi-enabled device and moves in the testing environment with a normal velocity of 1–1.3 m/s.

The time duration of each scenario is from 3 to 5 minutes, with about 400 RSSI values acquired from the 6 APs and an average time deviation between two consecutive samples of 2 seconds. The mobile devices and cameras are time-synchronized to Internet time. This synchronizes the data captured from both the camera and WiFi systems. Based on this, we can compute the real-world positions of a mobile user on the 2D floor map at each time. The time stamp of each person location calculated from the camera or WiFi system provides the basis for processing multi-modal object localization. The WiFi data is scanned from the mobile devices and stored in XML files. These devices continuously capture the signals from the available APs in the environment. The AP specifications are saved as records of scanning time, MAC address, AP name, and RSSI. The APs are distinguished by their own MAC addresses.

For the visual data, we manually assign FootPoint positions on the captured frames with the corresponding time stamps and IDs. These positions are then automatically transformed into 2D locations on the floor map by using camera calibration and a homography matrix. The person ID assigned in the visual data is equivalent to the ID of the WiFi adapter by a predefined convention. In short, for each scenario, the ground truth data is obtained and saved as XML files which contain the following records:

– Frame number.
– Person ID.
– Coordinates of the top-left and bottom-right positions of the bounding box containing the person.
– The image coordinates of the FootPoint position.
– The corresponding coordinates of the FootPoint position on the 2D floor map.

In case no person is detected, all records except the frame number are set to -1.

Figure 10: Visual examples in Script 2. The first row contains frames for the scenario of one moving person; the scenarios with 2, 3 and 5 moving people are shown in the second, third and fourth rows.

Figure 10 illustrates examples in Script 2. The frames in the first row show the scenario of one moving person, while those in the second, third and fourth rows are frames for the scenarios of two, three and five moving people.
For the WiFi data recorded outside the camera FOVs, the ground truth of the person locations in these regions is calculated by a pedestrian foot-counting program. It takes input information from the acceleration and direction sensors that are available on smartphones or tablets [20]. Basically, the positions of the mobile user in this region are computed from the route length that the user covers between marking points or reference points. This distance is calculated by the foot counter, taking into account the average step length of each particular person. The foot counter gives a positioning result of 5 m with a deviation of 3 m for a route length of 120 m. In our test, the route length outside the camera views is only about 10 m. Since the bias of the foot counter accumulates over time, over 10 m this deviation is 0.8 m (equivalent to 8% of the route length). This yields a deviation of 8 cm per meter in the labeled dataset in comparison with the true positions. After the synchronization step between the WiFi and visual data, an interpolation method is applied to calculate the person positions that are outside the camera fields of view.

6.2 Evaluation metrics

In order to evaluate the performance of vision-based tracking, the metrics of Multiple Object Tracking Precision (MOTP) [18], Global Multiple Object Tracking Accuracy (GMOTA) [19], and the CMC (Cumulative Match Curve) are utilized. Assume that for each time step t, a multi-person tracker outputs a set of hypotheses \{h_1, .., h_m\} for a set of visible people \{u_1, .., u_n\}. MOTP measures the positioning error over all matched pairs of person and tracker hypothesis on all frames. This metric is defined by:

MOTP = \frac{\sum_{i,t} d_{i,t}}{\sum_t c_t}   (22)

where d_{i,t} is the Euclidean distance between the ground truth and the tracker hypothesis for person i at time frame t — in this work, the Euclidean distance between the ground-truth and tracker-hypothesis FootPoint positions. The element c_t indicates the number of matched pairs at time step t.

GMOTA is an extension of MOTA (Multiple Object Tracking Accuracy) [18]. MOTA measures the number of errors the tracker makes in terms of false negatives (missed detections), false positives (wrong detections), mismatches and failures to recover tracks. This score is computed as follows:

MOTA = 1 - \frac{\sum_t (FN_t + FP_t + ID_t)}{\sum_t g_t}   (23)

where FN_t is the number of false negatives, FP_t the number of false positives, ID_t the number of instantaneous identity switches, and g_t the number of ground-truth detections at time frame t. In the GMOTA score, ID_t is replaced by a global ID_t (gID_t). This means that gID_t represents the performance of the tracker in preserving person identity assignments in a global manner, instead of the instantaneous identity assignments of MOTA:

GMOTA = 1 - \frac{\sum_t (FN_t + FP_t + gID_t)}{\sum_t g_t}   (24)

The CMC is employed as the performance evaluation metric for vision-based person Re-ID. The CMC curve presents the expectation of finding the correct match in the top n matches.
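The metrics in Eqs. (22)–(24) reduce to a few array reductions. A minimal sketch follows; per-frame count arrays are assumed as inputs, and the 90%-reliability statistic used for the WiFi system (described next) is included for completeness.

```python
import numpy as np

def motp(distances, matches_per_frame):
    """Eq. (22): total matched-pair distance over total matched pairs."""
    return np.sum(distances) / np.sum(matches_per_frame)

def gmota(fn, fp, gid, gt):
    """Eq. (24): 1 - (misses + false positives + global ID errors) / GT.
    All arguments are per-frame counts (arrays of equal length)."""
    return 1.0 - (np.sum(fn) + np.sum(fp) + np.sum(gid)) / np.sum(gt)

def error_at_reliability(errors, reliability=0.9):
    """WiFi statistic: the error value that 90% of test errors fall below."""
    return np.percentile(errors, reliability * 100)
```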
The accuracy of the WiFi-based localization system is evaluated by the statistical values of maximal error, average error, and error at a reliability of 90%. The maximal error is the maximum distance deviation in meters between the positions determined by the system and the ground-truth positions. The average error is the average distance deviation in meters between the positions determined by the system and the ground-truth positions. The error at a reliability of 90% indicates the distance deviation value in meters below which 90% of the test results fall. The performance of the fusion method is evaluated in this work by the GMOTA metric.

6.3 Experimental results

In vision-based person localization, in each camera FOV, person identification is done by a so-called process of identification by tracking. This means a trajectory which belongs to an individual in the current frame is linked to the corresponding one from the previous frame based on an optimal assignment of the Euclidean distances between them. However, this results in ID switches when people cross each other. The proposed method in Section 3.2 for person Re-ID helps to solve not only person identification in each camera FOV, but also person Re-ID among multiple cameras, by using a robust appearance-based descriptor built on the detected human ROI at each FootPoint position. This allows tracking by identification to be performed. However, the person identification and Re-ID performance still needs to be improved, especially in case of inter-person occlusions and people with similar appearances. The proposed fusion algorithm adds the clearer ID information of the WiFi adapter for performing tracking by identification.

In the following sections, the testing results for WiFi-based localization, vision-based localization and Re-ID, and fusion-based tracking are shown.

6.3.1 WiFi-based localization results

The system parameters of the WiFi-based localization model are calculated first; based on these, the positioning results are then given. Firstly, the training process using the GA is set up with the configuration provided in Table 1. Using these data, the optimal parameters are produced as in Table 2.

Parameter             Value     Parameter             Value
Population size       20        Tolerance             10^-6
Elite count           5         Selection             Uniform
Crossover fraction    0.5       Crossover             Scattered
Time limit            No        Mutation              Uniform
Maximal generations   No        Creation population   Uniform

Table 1: Genetic algorithm configuration.

Parameter   Values for the first scenario   Values for the second scenario
P_0         -41 dBm                         -36.1757 dBm
n           1.1                             2.2029
k_σ         1.0035 m^-1                     5.3147 m^-1
r_0         5 m                             2.5117 m
k_d         49.23 dBm·m^-1                  5.1311 dBm·m^-1

Table 2: Optimized system parameters for the first and the second scenarios of the testing environments.

Fingerprint feature   Maximal error (m)   Average error (m)   Error at reliability of 90% (m)
RSSI                  6.3                 1.86                2.99
Distance              6.27                1.89                2.98

Table 3: Evaluations for distance and RSSI features in case of using coefficient λ.

Fingerprint feature   Maximal error (m)   Average error (m)   Error at reliability of 90% (m)
RSSI                  6.06                1.76                3.55
Distance              6.5                 1.59                2.9

Table 4: Localization results using the distance and RSSI features, without using coefficient λ.

Secondly, the weights for different values of θ as a function of dissimilarity are given in Figure 11. Different values of λ are presented in Figure 12: with λ = 0.5 × 10^-6, the influence is reduced by a factor of 3 when the fingerprints were scanned 1 month before the testing time (roughly 2.6 × 10^6 seconds). Similarly, when the fingerprints were taken 2 months before the testing time, their influence is only 10% of that of new fingerprints. In this work, we choose k = 9, θ = 1.1 and λ = 2 × 10^-6. The radio maps and fingerprint locations in the testing environment are shown in Figure 13a and Figure 13b.
The regions with a deep pink color indicate that more APs are available than in the regions with a light pink color.

The localization experiments are conducted using the fingerprinting method with distance features calculated by the proposed probabilistic propagation model. Comparative results are also given for the fingerprinting method with RSSI features. Additionally, the stability and reliability of the radio map with distance features is confirmed by the evaluations with the coefficient λ.

Figures 14, 15 and 16 show the comparative results when the coefficient λ is taken into account. The localization results, the distribution of the localization results compared to the real locations, and the reliability of the localization result as a function of the localization error are shown correspondingly in these figures. The details of these results are given in Table 3. It can be seen from the experiments that the positioning errors at a reliability of 90% when using distance features are comparable with those obtained using RSSI features (2.98 m vs. 2.99 m). However, without using λ, the localization reliability for RSSI features decreases, whilst it remains stable for distance features. The results for this are shown in Figures 17, 18 and 21, and in Table 4: the error at a reliability of 90% is 3.55 m for RSSI features, but 2.9 m for distance features. The above experiments show that using distance features for the fingerprint data results in more stable and reliable radio maps in comparison with using RSSI features. Moreover, this also brings a lower cost for updating the fingerprint data, which is considered one of the most challenging problems of the fingerprinting method in WiFi-based localization.

Figure 11: Weights for different values of θ (0.5 to 1.3) as a function of dissimilarity.

Figure 12: Weights for different values of λ (1/1000000 to 1/6000000), plotted against time.

Figure 13: (a) The radio map, with (b) 2000 fingerprint locations collected in the testing environment.

            Vision-based evaluations            The proposed fusion algorithm
            Hallway (Cam 1)  Showroom (Cam 3)   Hallway (Cam 1)  Showroom (Cam 3)
MOTP (cm)   24.3             21.3               24.3             21.3
FN (%)      17.1             26.4               7.6              12.6
FP (%)      22.7             18.3               3.4              2.1
gID         28.3             11.6               4.9              2.3
GMOTA (%)   31.2             52.6               83.9             85.7

Table 5: The comparative results of the proposed fusion algorithm against the vision-based evaluations on the testing data of Script 1.

Figure 14: Localization results with distance and RSSI features when using coefficient λ (ground-truth path vs. localization results).

6.3.2 Experimental results for vision- and fusion-based tracking

The performance of vision-based person localization and Re-ID is evaluated on the Script 1 and Script 2 databases. In addition, the comparative results obtained from the fusion system of camera and WiFi are also reported on these.

Firstly, the vision-based person Re-ID evaluations are done on the Script 1 data. The human ROIs are manually extracted from the frames captured by three non-overlapping cameras: Cam 1 (hallway), Cam 2 (lobby) and Cam 3 (showroom).
The human ROIs from Cam 2 are used for the training phase (see Figure 19) and the human ROIs from Cam 1 and Cam 3 for the testing phase (see Figure 20). We train the system with 10 people in total, including the two test subjects, using the images of human ROIs extracted from Cam 2. Figure 22 shows the person recognition rates for this experiment, with a Rank 1 rate of 51.1%.

Table 5 shows the results for vision-based localization, with the two scenarios of Hallway (Cam 1) and Showroom (Cam 3) considered. MOTP evaluated on the vision-based localization system is 24.3 cm and 21.3 cm for the Hallway and Showroom scenarios, respectively. These values are retained in the fusion model of camera and WiFi.

Figure 15: Distribution of the localization error for distance and RSSI features when using coefficient λ.

The GMOTA ratio for the Showroom is better than for the Hallway, with 52.6% compared to 31.2%. However, by integrating with WiFi, these values increase dramatically, to 83.9% for the Hallway and 85.7% for the Showroom. This results from the sharp decreases in the rates of FN, FP and gID in both scenarios. Additionally, even in comparison with the perfect case of manual human detection in vision-based Re-ID, the performance of person tracking by identification is not as good as the results from the proposed fusion algorithm.

Secondly, for further evaluation of the proposed fusion algorithm, experiments are done on the data of Script 2. This dataset is very challenging compared to Script 1, because of severe occlusions and the similarity in human appearance. Moreover, people moving together along the same route is also a challenge for WiFi-based localization.

In the experiments with this data, we use the ground-truth data of the FootPoint positions and the corresponding human ROIs for the testing evaluations. The parameter gID in the GMOTA metric now indicates the performance of the tracker in maintaining the person ID when he/she moves from one camera FOV to others or re-appears in one camera FOV. Table 6 shows the comparative results of GMOTA when applying the fusion algorithm, and of Rank 1 for person Re-ID.

Figure 16: Localization reliability for distance and RSSI features when using coefficient λ.

Figure 17: Localization results for distance and RSSI features, without using coefficient λ.

It should be noted that FN and FP are not included in the testing evaluations of GMOTA, because we use the ground-truth data of the FootPoint positions and human ROIs. In this case, only gID is taken into account. This means that the performance of maintaining the person ID in tracking now depends only on the performance of the WiFi-based person localization. In comparison with the GMOTA values from Script 1, the GMOTA figures from Script 2 are much lower: only 31.7% for the scenario of two moving people, and 16.5% and 11.2% for the scenarios of three and five moving people, respectively. This can be explained by the data of Script 2 being much more challenging than Script 1.
However, compared with person Re-ID by the kernel descriptor, these results are much higher. In these experiments, besides the testing people, we train the system with 20 other people at the check-in gate for person Re-ID. The recognition rate at Rank 1 is only 12.6% for the scenario of two moving people, which is 19.1% lower than the fusion-based method. The Rank 1 figures for the scenarios of three and five moving people are 8.9% and 5.6%, respectively. Clearly, the performance of person Re-ID based on the kernel descriptor degrades when human appearances are similar.

Figure 18: Distribution of localization error for distance and RSSI features, without using coefficient λ (axes: X (m) vs. Y (m)).

Table 6: Experimental results for person tracking and person Re-ID with the Script 2 dataset.

            Two people  Three people  Five people
GMOTA (%)   31.7        16.5          11.2
Rank 1 (%)  12.6        8.9           5.6

From the above comparative evaluations, we can see that by using the proposed fusion algorithm, the performance of person tracking by identification and person Re-ID is improved significantly. The high-accuracy vision-based person localization, together with the clear ID information from the WiFi-enabled device, is integrated into each detected FootPoint position. This allows tracking by identification in each camera FOV, and based on this, person Re-ID in non-overlapping camera networks can be solved more effectively than by applying a vision-based method alone.

Figure 19: Training examples of manually-extracted human ROIs from Cam 2 for person 1 (images on the left) and person 2 (images on the right).

Figure 20: Testing examples of manually-extracted human ROIs from Cam 1 (left column) and Cam 3 (right column) for (a) person 1 and (b) person 2.

Figure 21: Localization reliability for distance and RSSI features, without using coefficient λ (axes: Error (m) vs. reliability distribution (%)).

Figure 22: Person Re-ID evaluations on the testing data of two moving people (axes: Rank vs. recognition rate).

7 Conclusion

In this work, person localization and Re-ID in surveillance regions covered by WiFi signals and cameras with disjoint FOVs are improved by a fusion algorithm based on a Kalman filter and an optimal assignment technique. This algorithm operates on the position observations on the 2D floor map obtained from each single system of camera or WiFi. Evaluation on the multimodal dataset shows that the proposed fusion algorithm clearly outperforms the single-modality alternatives. The high positioning accuracy of the vision-based system is maintained in the multimodal person localization system. Additionally, the fusion algorithm allows tracking by identification, and based on this, person Re-ID in non-overlapping cameras is done with clear identity information taken from the WiFi-based system. In future work, other localization techniques, such as RFID or UWB, can be integrated into the multimodal system in order to improve the positioning accuracy and person Re-ID; the fusion algorithm for person localization and Re-ID will be correspondingly broadened to accommodate such additions.
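The conclusion names a Kalman filter combined with an optimal assignment technique. As a minimal sketch of the assignment step only (assuming Euclidean costs over Kalman-predicted 2D floor-map positions; an illustration, not the authors' implementation):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_ids(predicted_pos, observed_pos):
    """Match Kalman-predicted track positions to new 2D observations.

    predicted_pos -- array (num_tracks, 2) of predicted floor positions
    observed_pos  -- array (num_obs, 2) of camera/WiFi observations
    Solves the optimal assignment (Hungarian method, cf. Kuhn [7])
    over a Euclidean cost matrix so each track keeps a consistent ID.
    """
    cost = np.linalg.norm(
        predicted_pos[:, None, :] - observed_pos[None, :, :], axis=2)
    track_idx, obs_idx = linear_sum_assignment(cost)
    return list(zip(track_idx, obs_idx))
```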
Acknowledgement

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.04-2013.32.

References

[1] Van den Berghe, Sam and Weyn, Maarten and Spruyt, Vincent and Ledda, Alessandro (2011) Combining wireless and visual tracking for an indoor environment, International Conference on Indoor Positioning and Indoor Navigation (IPIN-2011).
[2] Miyaki, Takashi and Yamasaki, Toshihiko and Aizawa, Kiyoharu (2007) Visual tracking of pedestrians jointly using wi-fi location system on distributed camera network, 2007 IEEE International Conference on Multimedia and Expo, IEEE, p. 1762–1765.
[3] Rekimoto, Jun and Shionozaki, Atsushi and Sueyoshi, Takahiko and Miyaki, Takashi (2006) PlaceEngine: a WiFi location platform based on realworld folksonomy, Internet Conference, p. 95–104.
[4] Cheng, Yu-Chung and Chawathe, Yatin and LaMarca, Anthony and Krumm, John (2005) Accuracy characterization for metropolitan-scale Wi-Fi localization, Proceedings of the 3rd International Conference on Mobile Systems, Applications, and Services, ACM, p. 233–245.
[5] Alahi, Alexandre and Haque, Albert and Fei-Fei, Li (2015) RGB-W: When Vision Meets Wireless, Proceedings of the IEEE International Conference on Computer Vision, IEEE, p. 3289–3297.
[6] Pham, T. T. T., Le, T. L., Vu, H., and Dao, T. K. (2017) Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method, Image and Vision Computing, Elsevier, p. 44–62.
[7] Kuhn, Harold W (1955) The Hungarian method for the assignment problem, Naval Research Logistics Quarterly, Wiley Online Library, p. 83–97.
[8] Zhang, Zhengyou (2000) A flexible new technique for camera calibration, Pattern Analysis and Machine Intelligence, IEEE, p. 1330–1334.
[9] Thi Thanh Thuy Pham, Anh Tuan Pham, Hai Vu (2015) A new technique for linking person trajectories in surveillance camera network, Conference on Fundamental and Applied IT Research (FAIR), p. 8–15.
[10] Bo, Liefeng and Ren, Xiaofeng and Fox, Dieter (2010) Kernel descriptors for visual recognition, Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, p. 244–252.
[11] Dao, Trung-Kien and Pham, Thanh-Thuy and Castelli, Eric (2013) A robust WLAN positioning system based on probabilistic propagation model, 9th International Conference on Intelligent Environments (IE), IEEE, p. 24–29.
[12] Goldsmith, A. (2005) Wireless communications, Cambridge University Press.
[13] Roberts B. and Pahlavan K. (2009) Site-specific rss signature modeling for wifi localization, Global Telecommunications Conference, IEEE, p. 1–6.
[14] Munoz D., Lara F.B., Vargas C., and Enriquez-Caldera R. (2009) Position location techniques and applications, Academic Press.
[15] Haupt, Randy L and Haupt, Sue Ellen (2004) Practical genetic algorithms, John Wiley & Sons.
[16] Jungmin So, Joo-Yub Lee, Cheal-Hwan Yoon, Hyunjae Park (2013) An Improved Location Estimation Method for Wifi Fingerprint-based Indoor Localization, International Journal of Software Engineering and Its Applications.
[17] Arsham Farshad, Jiwei Li, Mahesh K. Marina, Francisco J. Garcia (2013) A Microscopic Look at WiFi Fingerprinting for Indoor Mobile Phone Localization in Diverse Environments, International Conference on Indoor Positioning and Indoor Navigation.
[18] Bernardin, Keni and Stiefelhagen, Rainer (2008) Evaluating multiple object tracking performance: the CLEAR MOT metrics, EURASIP Journal on Image and Video Processing, Springer, p. 1–10.
[19] Ben Shitrit, Horesh and Berclaz, Jerome and Fleuret, François and Fua, Pascal (2013) Tracklet-based Multi-Commodity Network Flow for Tracking Multiple People, No. EPFL-PATENT-186751, WO.
[20] Kothari, Nisarg and Kannan, Balajee and Glasgow, Evan D and Dias, M Bernardine (2012) Robust indoor localization on a commercial smart phone, Procedia Computer Science, Elsevier, p. 1114–1120.

Persons-In-Places: a Deep Features Based Approach for Searching a Specific Person in a Specific Location

Vinh-Tiep Nguyen, Thanh Duc Ngo, Minh-Triet Tran, Duy-Dinh Le and Duc Anh Duong
University of Information Technology, University of Science
E-mail: {tiepnv, thanhnd}@uit.edu.vn, tmtriet@fit.hcmus.edu.vn, {duyld,ducda}@uit.edu.vn

Keywords: video instance search, deep neural network, location search, person search

Received: March 29, 2017

Video retrieval is a challenging task in computer vision, especially with complex queries. In this paper, we consider a new type of complex query which simultaneously covers person and location information. The aim is to search for a specific person in a specific location. Bag-Of-Visual-Words (BOW) is widely known as an effective model for representing rich-textured objects and scenes of places. Meanwhile, deep features are powerful for faces. Based on such state-of-the-art approaches, we introduce a framework to leverage the BOW model and deep features for person-place video retrieval. First, we propose to use a linear kernel classifier instead of the L2 distance to estimate the similarity of faces, given that faces are represented by deep features. Second, scene tracking is employed to deal with cases in which the face of the query person is not detected. Third, we evaluate several strategies for fusing individual person search and location search results. Experiments were conducted on a standard benchmark dataset (TRECVID Instance Search 2016) with more than 300 GB in storage and 464 hours in duration.

Povzetek: V prispevku je opisana metoda povpraševanja po osebi in lokaciji iz video vsebin.

1 Introduction

With the rapid growth of video recording devices, many videos from diverse domains such as professional or amateur film making, surveillance and home recording are being created. These vast video collections are being shared on video broadcasting sites (e.g., YouTube). One of the most fundamental needs is to help users find exactly what they are looking for in video databases. To search directly on videos, we consider visual instance search on video databases. The term instance search (INS) is defined formally by TRECVID [13]: finding video segments of a certain specific object, place or person, given visual examples from a video collection. Query types vary, including rich-textured, fairly-textured and deformable objects. This makes instance search a very challenging task, since we have no prior information about the query. The objective of this problem is to find the person and the location in a large-scale video dataset. This type of query is important since persons and locations are the two most popular query objects. It has many practical applications, such as surveillance systems and personal video archive management.
This query topic is also very hard because of large variations in size, lighting conditions and viewpoint. Figure 1 gives an example of this type of query. Images in the first row are examples of a pub that a user wants to find. These images cover multiple views of a location with many irrelevant or noisy objects such as humans and decorations. These objects may lower retrieval accuracy due to noisy features. Images in the second row are examples of the person that the user also needs to find if he appears at the pub. Persons are special query objects because they are 3D objects with multiple views and deformable, with varying clothes textures. All of these make our retrieval task with this compound query more challenging.

A very natural approach is to combine the scores of recognizing the face and the location. There are some challenges in this approach:

– The scores are independent and incomparable. This makes typical fusion techniques such as average fusion ineffective.
– Frames with very clear and recognizable faces often devote a large portion of the image to the person and thus carry less information about the context scene. Hence, a frame with a higher score for recognizing a face may have a lower score or rank for the location, and vice versa. This leads to low performance when the scores are simply combined.
– In a video scene that contains a person and a location, both of them are not always shown perfectly: the person may change their head pose in multiple directions, while the location may change points of view over time. However, query examples do not cover all views of the target objects.

Figure 1: A query topic includes location examples (first row images) and person examples (second row images) marked by magenta boundaries.

Most state-of-the-art object instance retrieval systems are based on a bottom-up approach with the very well-known Bag-of-Visual-Words (BOW) model [23], which benefits from powerful local descriptors for matching textures and then checks geometric consistency to further improve the accuracy. This approach relies on the key assumption that two similar objects share a significant number of local patches that can be matched against each other.

When searching for rich-textured instances which contain enough discriminative texture patterns (e.g. locations, buildings, book covers, paintings, etc.), there are some ambiguous patches that share similar shapes with the query instance but belong to an irrelevant object. However, the ratio of these patches is low; thus the similarity scores of images containing the correct instance are higher than those of incorrect ones. Moreover, its extensions, e.g. geometric consistency checking [16][30] and query expansion [8][7][1], further improve the performance of the searching system significantly.

When searching for objects with highly flexible appearance, such as humans, performance is still very low due to the limited representational capacity of the BOW model. For the first video segment in which the query person appears, the problem is equivalent to face recognition, without using other information such as clothes texture features. From that segment to the end of a scene, people are likely to be in the same place even if the face disappears. In this paper, we propose a system which leverages both BOW and Convolutional Neural Network (CNN) based features for retrieving this new type of query. For location search, we combine BOW based and CNN based features to improve the performance.
For person search, we use the VGG-Face feature for recognizing the first video shot in which the target person appears. Instead of using a distance metric such as L2, we propose a linear kernel method to learn the high-level features encoded by a deep CNN. Finally, in order to boost the recall of the system, we implement scene tracking to keep tracking the shots following the high-response ones.

The rest of this paper is organized as follows. Section 2 presents related work. Details of our instance search framework are presented in Section 3. Section 4 presents our experimental results on the TRECVID dataset. Finally, Section 5 concludes the paper.

2 Related work

To improve the performance of INS systems, multiple techniques have been proposed, such as the rootSIFT feature [1], large vocabularies [16] and soft assignment [17]. Among them, spatial verification is one of the most effective approaches, and it also serves as the prerequisite step for other advanced techniques such as query expansion. Spatial verification can be classified into two categories: spatial reranking [16][30][33] and spatial ranking [10][5][21]. These approaches work very well on large, rich-textured objects such as locations.

To further improve the performance, Wan et al. explore deep learning techniques with application to the instance search task [31]. They show that deep features from a CNN model pre-trained on a large-scale dataset can be used for representing an image or object in a new instance search task. Moreover, by retraining the deep models on the new domain, the retrieval performance can be boosted significantly. Although the amount of training data is only a few examples per query object, a network pre-trained with parameters learned from a previous large-scale dataset converges quickly on the new data domain.

In addition to retraining the CNN, Babenko et al. investigate the performance of compressed deep features, where plain PCA or a combination of PCA with discriminative dimensionality reduction results in very short codes with state-of-the-art performance [4]. They explain that passing an image through the network discards much of the information that is irrelevant for classification (and for retrieval). Thus, CNN based neural codes from deeper layers retain less (useless) information than unsupervised aggregation-based representations; therefore PCA compression works better for neural codes. Besides deep encoding techniques, the authors also introduce and evaluate a new simple and compact global image descriptor and investigate the reasons underlying its success [3]. They show that feature aggregation using sum-pooling outperforms max-pooling on deep features from fully connected layers [18], as well as VLAD [2] and democratic aggregation [11], which were successfully applied to SIFT features.

Another problem this paper focuses on is face recognition in images and videos. We classify the methods proposed in the literature into two groups: those that do not use deep learning and those that do. Methods in the first group (also named "shallow" methods) start by extracting a representation of the face image using hand-crafted local image descriptors such as SIFT, LBP and HOG [9][12][32]; then they aggregate such local descriptors into an overall face descriptor by using a pooling mechanism, for example the Fisher Vector [14][22].
This work is concerned mainly with deep architectures, which currently reach state-of-the-art performance. The idea of such methods is to use a CNN feature extractor with parameters learned by composing several linear and non-linear operators. One of the representative methods of this approach is DeepFace [28]. This method uses a deep CNN trained to classify faces using a dataset of 4 million examples of 4000 persons. The goal of training is to minimize the distance between congruous pairs of faces (i.e. portraying the same identity) and maximize the distance between incongruous pairs, a form of metric learning. The authors later extended this work in [29] by increasing the size of the dataset to 10 million persons and 50 images per person. They proposed a bootstrapping strategy to select identities to train the network and showed that by controlling the dimensionality of the fully connected layer the generalisation of the network can be improved. The DeepID series of papers by Sun et al. [24][26][27][25] extends DeepFace, with each paper improving the performance on LFW and YTF incrementally and steadily. A number of new ideas were introduced over this series of papers, including: using multiple CNNs [26], a Bayesian learning framework [6] to train a metric, multi-task learning over classification and verification [24], different CNN architectures which branch a fully connected layer after each convolution layer [27], and very deep networks [25]. Compared to DeepFace, DeepID does not use 3D face alignment, but a simpler 2D affine alignment, and trains on a combination of CelebFaces [26] and WDRef [6]. However, the final model in [25] is quite complicated, involving around 200 CNNs.

Recently, research from Google [20] trained a CNN using a massive dataset of 200 million face identities and 800 million image face pairs. Their triplet-based loss compares two congruous faces (a, b) and a third incongruous face c. Differently from other metric learning approaches, their goal is to make a closer to b than to c; comparisons are always relative to a pivot face. In training, this loss is applied at multiple layers, not just the final one.

In this paper, we follow the VGG-Face descriptor network [15], which designs a procedure able to assemble a large-scale dataset with small label noise whilst minimizing the amount of manual annotation involved. The authors use weaker classifiers to rank the data presented to the annotators for reranking. They also show that a deep CNN can achieve results comparable to the state of the art with appropriate training and without any special techniques. In order to apply it to a new task (instance search) and data domain, instead of using the activation of the last layer, we propose to use the feature extracted from one of the fully connected layers with a linear classifier (e.g. a support vector machine with linear kernel) to train a face model for the query person. To further improve the performance of the instance search system, especially in the case that the target person turns his/her back to the camera, we propose to combine person tracking with scene tracking.

3 Proposed framework

This section describes our proposed framework and its configurations. Our proposed system includes four main modules: BOW based retrieval, location learning for verification, face learning for recognition, and final fusion. Figure 2 sketches out the work flow of the main components in our INS system.
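To make the data flow concrete before the detailed description, here is a minimal skeleton of the four modules (all stage functions are injected placeholders; names and signatures are assumptions for illustration, not the authors' code):

```python
def persons_in_places_search(location_examples, person_examples, shots,
                             bow_search, verify_location,
                             train_face_model, face_score,
                             propagate_scene_scores, top_k=1000):
    """Skeleton of the four-module INS pipeline (illustrative only)."""
    # 1. BOW-based location retrieval over the whole shot database.
    ranked = bow_search(location_examples, shots)[:top_k]
    # 2. Location verification (RANSAC geometry + CNN classifier):
    #    shots with negative decision values are dropped.
    ranked = [s for s in ranked if verify_location(location_examples, s) > 0]
    # 3. Face-based reranking with a linear-kernel face model.
    model = train_face_model(person_examples, ranked)
    scored = [(s, face_score(model, s)) for s in ranked]
    # 4. Scene tracking propagates positive scores to following shots.
    scored = propagate_scene_scores(scored)
    return sorted(scored, key=lambda t: t[1], reverse=True)
```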
Given a compound query topic including person and location examples, our goal is to rank video shots containing that combination. Each example is a video frame of the location or person captured at a specific point of view, as shown in Figure 1. In our framework, instead of using all frames of a video shot, we perform key frame extraction at 5 frames per second to save computational cost. For simplicity of notation, we only consider a set of query examples and the key frames of one shot in the video dataset; other shots are processed similarly. Firstly, for each location example, we extract local features using the Hessian-Affine detector and the rootSIFT descriptor, then quantize them using a codebook trained on the video database. In order to reduce the effect of noisy features caused by irrelevant persons, we remove all visual words inside bounding boxes found by a person detector. In this paper, we use Faster R-CNN [19] with a network pre-trained on PASCAL VOC 2007 to find person regions. Each location frame is finally represented by a BOW feature vector $L_k$ with the tf-idf weighting scheme. For each person example, we only use the information detected by a face detector, since the target person may change clothes over time. Each face bounding box is described by a CNN based descriptor and represented by a feature vector $F_p$.

Since location and person examples are independent, we could compute two rank lists independently. However, because the BOW model scales to large video data, we use location features to retrieve rank lists as the first step and then use face features for later reranking. The top K retrieved shots based on the BOW similarity score are then used for the reranking stage. Note that the BOW model is a non-structured model which does not take into account the spatial relationships between visual words. To remove irrelevant shots, we combine a RANSAC based algorithm and a learning based approach on the high-level feature vectors produced by the very deep CNN network VGG-19.

Figure 2: Framework overview.

The second part of our system is person recognition based reranking. A person example includes a color image and its mask, which helps the system separate the person of interest from irrelevant objects. In this case, we only focus on face features, since the target person may change clothes over time. We use a face detector and a face descriptor to extract representative features of the query person. After this stage, each person is represented by a set of deep feature vectors. A typical way to compare face features is using a symmetric distance or similarity score. In this approach, each component of a feature vector is treated equally. However, this vector is a high-level feature which describes many parts of a face; some of them are important and some are not. Hence, we propose to use a linear classifier to learn the weights of a face feature, and then compute the similarity score between the face model and a video shot.

Finally, we propose a final fusion step which takes into account all components of the system: BOW based location search, CNN based irrelevant location removal, face based reranking, and scene tracking. In a video scene that contains a person and a location, both of them are not always shown perfectly: a person may change their head pose in multiple directions while a location may change points of view over time. However, query examples of face and location are limited and incomplete.
To propagate the score of a positive shot, we carry that value over to the following shots with a multiplicative factor.

3.1 Location search

In the first stage of the system, we retrieve the top K shots that are similar to the location examples using the BOW model with local features. In this paper, we use the state-of-the-art configuration of the BOW framework that has been used for image retrieval. Local features of each key frame of a shot are extracted using the Hessian-Affine detector and the rootSIFT feature descriptor. Each feature is represented by a 128-dimensional vector. All features gathered from the database video frames are clustered using the approximate K-Means algorithm (AKM) with a very large number of codewords. Owing to hardware limitations, only 100 million features are randomly sampled to train 1 million codewords. The features are then quantized using the codebook with a hard-assignment strategy. Finally, each video frame is represented by a very sparse BOW feature vector using the tf-idf (term frequency-inverse document frequency) weighting scheme. Because the rank list counts video shots, not video frames, we aggregate all BOW vectors of the frames of a shot into a single one for compact representation and fast retrieval. In this encoding scheme, the j-th frame of the i-th video shot is represented by a BOW feature vector $S_{i,j}$. We accumulate all vectors of a shot into a single one using average pooling:

$S_i = \frac{1}{n} \sum_{j=1}^{n} S_{i,j} \qquad (1)$

where n is the number of key frames of the shot. The feature vectors of the video shots are then used to build an inverted index, which significantly boosts the speed of retrieval.

Figure 3: Two images illustrate a location example (left) and a query person example (right). For the location example, there may be some irrelevant persons (marked by yellow boundaries) whose noisy visual words take part in the BOW feature vector of the frame. For the person example (marked by a magenta boundary), the face feature is one of the most important features for retrieval.

The similarity between the i-th shot and the given location is computed by the following formula:

$LS_i = \frac{1}{n'} \sum_{k=1}^{n'} \mathrm{asym}(L_k, S_i) \qquad (2)$

where n′ is the number of query examples and asym is an asymmetrical similarity score [34]. The top K shots returned by the BOW model are then reranked in the next steps. One important parameter in this initial step is K, the threshold for selecting the top ranked shots. By observing the z-score normalized distances of all query examples, we found that they have the same distribution, as shown in Figure 4. Based on this, we fixed the cut-off threshold for the top K shots at −2.5.

The main assumption of the BOW model is that two similar objects share a significant number of local patches that can be matched against each other. The chosen query examples are often captured in perfect views due to the meticulousness of the user, while database frames are not. Under significant changes of the point of view, the local-feature-based BOW model gives poor retrieval performance. To be more robust to the point of view, we represent each video frame by a high-level feature vector derived from a fully connected layer of a CNN. We use a very deep pre-trained network, i.e. VGG-19, and remove the last layer, which is commonly used for the classification task. Video frames are re-sized and normalized before being passed to the feed-forward network. The output of the network is a 4096-dimensional feature vector representing the whole video frame. Comparing two video frames is equivalent to comparing their representing feature vectors. However, using a symmetric metric such as the Euclidean distance (L2) may result in low accuracy, since all components of a feature vector then play the same role. In fact, for each location, some of the components are important. A learning method is proposed to magnify the role of these key components.
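As a small illustration of the shot-level aggregation in Eq. (1) (a sketch assuming the per-frame tf-idf BOW vectors are already computed):

```python
import numpy as np

def pool_shot_bow(frame_bows):
    """Average-pool per-frame BOW vectors into one shot vector, Eq. (1).

    frame_bows -- array of shape (n_frames, vocab_size), tf-idf weighted.
    """
    return np.asarray(frame_bows).mean(axis=0)
```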
3.2 Face feature learning for reranking

The second part of the query is the person examples. Face recognition is a very popular approach to identify a person. Faces are detected using the DPM cascade detector [32], applied to at most 5 key frames per shot. Then face feature vectors are extracted using the VGG-Face descriptor, a CNN based network [15]. In particular, each face image is represented by a 4096-dimensional deep feature vector. After this stage, each person is represented by a set of deep feature vectors $\{F_1, F_2, \ldots, F_m\}$, where m is the number of face examples. We proceed similarly for each frame of a video shot: $S_{F_{i,j,k}}$ represents the feature vector extracted from a face of a person in a video frame. A natural way to compute the similarity between a person and a shot is to take the minimum distance over all pairs of face feature vectors:

$FS_i = \min_{l,j,k} L_2(F^*_l, S^*_{F_{i,j,k}})$

where $F^*_l$ and $S^*_{F_{i,j,k}}$ are the normalized vectors of $F_l$ and $S_{F_{i,j,k}}$, and $L_2$ is the Euclidean distance metric.

Although this feature is designed to work with the L2 distance metric, there is a big performance gap. This can be explained by the fact that the components of a face feature vector should not all carry the same weight; for each face, the weights of the components are different. Therefore, we propose to learn these features with a large margin classifier with a linear kernel. Each candidate face in a frame of a shot, after being passed to the classifier, is scored by a decision value; positive values indicate a positive example, and vice versa.

In this paper, we use a Support Vector Machine (SVM) with linear kernel to train on the face features of the target person. Positive features are chosen from the query examples, while negative ones are taken from the last 50 persons of the initial rank list returned by the L2 distance based approach. After training with the SVM algorithm, the target person is represented by a single model M.

Figure 4: Distribution of z-score normalized distance.
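A minimal sketch of this training step, assuming face descriptors are already extracted and normalized (scikit-learn's LinearSVC stands in for the linear-kernel SVM; the negative mining follows the text):

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_face_model(query_faces, negative_faces):
    """Fit a linear SVM separating the query person's face features
    from features mined from the tail of the initial L2 rank list."""
    X = np.vstack([query_faces, negative_faces])
    y = np.concatenate([np.ones(len(query_faces)),
                        -np.ones(len(negative_faces))])
    return LinearSVC(C=1.0).fit(X, y)

# A shot's face score is then the maximum decision value over the
# normalized face vectors detected in its frames:
#   score_i = model.decision_function(shot_faces).max()
```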
3.3 Final fusion

This is our main contribution module, which leverages the power of the BOW model, deep features and machine learning. First, the rank list returned by BOW based location search is used as the input of the geometric verification step. The visual words of each database video frame are verified using the RANSAC algorithm; the number of inliers represents the similarity between a video frame and the query location. The output of the geometric verification step is the input of the irrelevant location removal step. Using the classifier learned from the location examples, we classify each video frame of a shot with the linear kernel approach. The output score of a shot is the average of the decision values of all frames in that shot. We remove shots with negative decision values and pass the remaining ones to the next step.

In the face based reranking step, we use the face model learned from the query examples to recognize persons in a video shot. The output score of the i-th shot in this step is the maximum decision value over all faces in its frames:

$score_i = \max_{j,k} \mathrm{svm}(M, S^*_{F_{i,j,k}})$

where M is the face model, $S^*_{F_{i,j,k}}$ is the normalized vector of $S_{F_{i,j,k}}$, and svm is the linear classifier. If $score_i > 0$, there is at least one frame containing the query person in the i-th shot, and vice versa.

The final step of our system is scene tracking. To deal with cases where the target person appears in a shot but his face is unclear, we transfer the decision value from the last positive shot to the following ones with a small decrease. Note that we only apply scene tracking to shots which have negative decision values. Assume that two consecutive shots i and i+1 have scores $score_i > 0$ and $score_{i+1} \le 0$; we then update $score_{i+1} = \frac{1}{2} score_i$. We update at most 5 consecutive shots with the same factor. The output of this step is the rank list obtained by sorting the final score values in descending order.

4 Experiment

4.1 Dataset

To demonstrate the advantage of the proposed method on different types of query, we used the TRECVID Instance Search (INS) datasets for evaluation, specifically the TRECVID INS benchmark of 2016 released by NIST. For experimentation, we name this dataset INS2016. For the six years 2010-2015, the instance search task tested systems on retrieving specific instances of objects, persons and locations; these editions share the same collection of test videos with a master shot reference. In 2016, a new query type was tested by asking systems to retrieve specific persons in specific locations. The dataset contains approximately 244 video files extracted from the BBC EastEnders program, with a total of 300 GB in storage and 464 hours in duration. Each query topic of INS2016 consists of two sets of examples: location and person. For the person set, each example includes an image and a corresponding mask that delimits the target entity from others. For the location set, only image examples are provided. This INS dataset is very challenging due to the variety of query types: from indoor to outdoor locations, and from unclear to clear persons.

Evaluation Protocol. There are 30 query topics, or person-location pairs, and about 470 thousand video shots in this challenge. The system must return the top 1000 shots that are most similar to each given topic. The ground truth files for each query are created manually and provided by the TRECVID organizers. To evaluate the performance of each method, we use the mean average precision (MAP) as a standard measurement. Although some evaluations of intermediate results, such as location search when combining deep features and BOW, would be informative, there are already reports on the performance of state-of-the-art systems on the individual queries of previous years' challenges [13]. Therefore, in this paper, we only consider the performance on the compound query.

4.2 Retrieval performance and visualization

In this section, we discuss some quantitative results of our method evaluated against the ground truth from TRECVID INS 2016. For ease of observation, we use the following abbreviations:

– Avg-Fusion: fusion of the normalized scores of person and location.
– L2-Reranking: using our framework, after the geometric verification step, we rerank the initial top K list using the L2 distance for face features. The similarity score of a frame is the opposite of the min-min distance between the face examples and all faces detected in the frames of a shot.
We use the mean of the similarity scores of all frames in a shot to represent the final similarity (average pooling), as in the other methods in this experiment.
– CNN-Loc+L2-Reranking: similar to L2-Reranking, but we add the CNN based location reranking step after the geometric reranking step.
– Linear Kernel: similar to the baseline CNN-Loc+L2-Reranking, but we use a linear kernel to learn the face model of the query person and compute the similarity scores with candidate faces.
– Linear Kernel+scene tracking: similar to Linear Kernel, but we also apply scene tracking to deal with frames in which the face of the target person is not detected.

4.3 Average fusion for person-location query

In many systems, average fusion is one of the simplest and most effective methods to improve retrieval performance. However, for compound queries such as location-person, average fusion is not as good as the face based reranking method, as shown in Table 1. This can be explained by the scores of the target location and person being independent and incomparable. Moreover, frames with very clear and recognizable faces often devote a large portion of the image to the person and thus carry less information about the context scene. Hence, a frame with a higher score for recognizing a face may have a lower score or rank for the location, and vice versa.

Table 1: Comparison between average fusion and reranking methods.

Run            MAP
Avg-Fusion     15.6
L2-Reranking   18.9

4.4 Deep feature for location reranking

In this section, we illustrate that deep features for reranking improve the performance considerably, even for rich-textured query objects such as locations. The experimental results are shown in Table 2. Past state-of-the-art systems of TRECVID showed that, for rich-textured objects such as locations, the local feature based BOW model is one of the most suitable choices. However, in real-life videos, the proportion of location evidence is very small. Using CNN features of the query location, the system has more information to keep scenes that would otherwise be removed by the cut-off threshold in the geometric verification step.

Table 2: Comparison of retrieval systems with and without high-level feature reranking.

Run                     MAP
L2-Reranking            18.9
CNN-Loc+L2-Reranking    19.8

4.5 Face feature learning and scene tracking

Table 3 summarizes the results of our different methods, measuring their relative performance in terms of the MAP score.

Table 3: Experimental results on different configurations for TRECVID INS 2016.

Run                             MAP
Linear Kernel + scene tracking  50.6
Linear Kernel                   25.9
CNN-Loc+L2-Reranking            19.8

From the table, we can see that the first proposed method (Linear Kernel) performs much better than the baseline which only uses the L2 distance (CNN-Loc+L2-Reranking), showing a gain in MAP from 19.8% to 25.9%. Moreover, with the scene tracking step, the final performance is significantly boosted from 25.9% to 50.6%. Also note that the scene tracking step not only keeps the precision high but also improves the recall compared to the Linear Kernel method. There are many cases in which the target person does not turn his face towards the camera, so many shots are lost from the final rank list. By using scene tracking, the total recall of the retrieval system is improved substantially. This can be observed in the precision-recall curves shown in Figure 5, where the curve of Linear Kernel+scene tracking is significantly higher than the other ones.
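For concreteness, the scene tracking rule of Section 3.3 can be sketched as follows (assuming shots are given in temporal order; a minimal sketch, not the authors' code):

```python
def propagate_scene_scores(scores, factor=0.5, max_shots=5):
    """Carry a positive shot score forward to following negative shots.

    Implements score[i+1] = factor * score[i], repeated for at most
    max_shots consecutive shots whose own decision values are <= 0.
    """
    scores = list(scores)
    carried, remaining = 0.0, 0
    for i, s in enumerate(scores):
        if s > 0:            # a genuinely positive shot resets the carry
            carried, remaining = s, max_shots
        elif remaining > 0:  # decay and propagate the carried score
            carried *= factor
            scores[i] = carried
            remaining -= 1
    return scores
```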
To show the efficiency of the proposed method compared to the baseline system, we visualize the rank lists returned by the systems. The query topic is given in Figure 1. The top six shots returned by the system using the L2 distance and the Linear Kernel classifier are visualized in Figure 6; each row shows the key frames of one shot of a rank list. When using the L2 distance, the precision is very low, which is why the top-six rank list of the baseline contains many irrelevant shots, marked by red bounding boxes. Using the Linear Kernel classifier, the precision of the system is improved significantly; hence the ratio of relevant shots is very high.

Figure 5: Precision-recall curves for the experiments on TRECVID INS 2016.

Figure 6: Result visualization of the query from Figure 1. a) Top 6 rank list using the L2 distance. b) Top 6 rank list using the Linear Kernel classifier.

5 Conclusion

Inspired by recent successes of deep learning techniques, in this paper we leverage the power of deep features in the instance search task. We use deep features as a tool for reranking the location search results, bridging the semantic gap left by the BOW model. Moreover, to search for a more difficult object which is deformable and can be captured in different environments, we apply a machine learning approach to learn deep features extracted from human faces detected in video frames. In particular, we investigate a framework combining the BOW model and deep learning based features with application to the instance search task with a new type of query topic: a specific person in a specific location. By conducting experiments on a large-scale dataset, we showed that our proposed method significantly improves the retrieval performance.

In future work, we will investigate advanced deep learning techniques, such as retraining the network with new data generated from query examples. We will also evaluate the retrieval systems on other diverse datasets for more in-depth empirical studies.

Acknowledgement

The video frames from the BBC EastEnders video used in this document are programme material copyrighted by the BBC. This research is funded by Vietnam National University HoChiMinh City (VNU-HCM) under grant number B2017-26-01.

References

[1] R. Arandjelović and A. Zisserman. Three things everyone should know to improve object retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), CVPR '12, pages 2911–2918, Washington, DC, USA, 2012.
[2] R. Arandjelović and A. Zisserman. All about VLAD. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1578–1585, 2013.
[3] A. Babenko and V. S. Lempitsky. Aggregating deep convolutional features for image retrieval. CoRR, abs/1510.07493, 2015.
[4] A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky. Neural codes for image retrieval. In D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, editors, Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I, pages 584–599. Springer International Publishing, Cham, 2014.
[5] Y. Cao, C. Wang, Z. Li, L. Zhang, and L. Zhang. Spatial-bag-of-features. In IEEE Conference on Computer Vision and Pattern Recognition, pages 3352–3359, June 2010.
[6] D. Chen, X. Cao, L. Wang, F. Wen, and J. Sun. Bayesian face revisited: A joint formulation.
In Proceedings of the European Conference on Computer Vision - Volume Part III, ECCV'12, pages 566–579, Berlin, Heidelberg, 2012. Springer-Verlag.
[7] O. Chum, M. Perdoch, A. Mikulik, and J. Matas. Total recall II: Query expansion revisited. In IEEE Conference on Computer Vision and Pattern Recognition, pages 889–896, Los Alamitos, CA, USA, 2011.
[8] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman. Total recall: Automatic query expansion with a generative feature model for object retrieval. In IEEE International Conference on Computer Vision, 2007.
[9] R. G. Cinbis, J. Verbeek, and C. Schmid. Unsupervised Metric Learning for Face Identification in TV Video. In ICCV 2011 - International Conference on Computer Vision, pages 1559–1566, Barcelona, Spain, Nov. 2011. IEEE.
[10] H. Jegou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In Proceedings of the European Conference on Computer Vision: Part I, ECCV '08, pages 304–317, Berlin, Heidelberg, 2008. Springer-Verlag.
[11] H. Jégou and A. Zisserman. Triangulation embedding and democratic aggregation for image search. In CVPR - International Conference on Computer Vision and Pattern Recognition, Columbus, United States, June 2014.
[12] C. Lu and X. Tang. Surpassing human-level face verification performance on LFW with GaussianFace. In Proceedings of the AAAI Conference on Artificial Intelligence, AAAI'15, pages 3811–3819. AAAI Press, 2015.
[13] P. Over, J. Fiscus, G. Sanders, D. Joy, M. Michel, G. Awad, A. Smeaton, W. Kraaij, and G. Quénot. TRECVID 2014 – an overview of the goals, tasks, data, evaluation mechanisms and metrics. In Proceedings of TRECVID 2014. NIST, USA, 2014.
[14] O. M. Parkhi, K. Simonyan, A. Vedaldi, and A. Zisserman. A compact and discriminative face track descriptor. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2014.
[15] O. M. Parkhi, A. Vedaldi, and A. Zisserman. Deep face recognition. In British Machine Vision Conference, 2015.
[16] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007.
[17] J. Philbin, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.
[18] A. S. Razavian, H. Azizpour, J. Sullivan, and S. Carlsson. CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW '14, pages 512–519, Washington, DC, USA, 2014. IEEE Computer Society.
[19] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Neural Information Processing Systems (NIPS), 2015.
[20] F. Schroff, D. Kalenichenko, and J. Philbin. FaceNet: A unified embedding for face recognition and clustering. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[21] X. Shen, Z. Lin, J. Brandt, S. Avidan, and Y. Wu. Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 3013–3020, June 2012.
[22] K. Simonyan, O. M. Parkhi, A. Vedaldi, and A. Zisserman. Fisher Vector Faces in the Wild. In British Machine Vision Conference, 2013.
[23] J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In Proceedings of the International Conference on Computer Vision, volume 2, pages 1470–1477, Oct. 2003.
[24] Y. Sun, Y. Chen, X. Wang, and X. Tang. Deep learning face representation by joint identification-verification. In Proceedings of the International Conference on Neural Information Processing Systems, NIPS'14, pages 1988–1996, Cambridge, MA, USA, 2014. MIT Press.
[25] Y. Sun, D. Liang, X. Wang, and X. Tang. DeepID3: Face recognition with very deep neural networks. CoRR, abs/1502.00873, 2015.
[26] Y. Sun, X. Wang, and X. Tang. Deep learning face representation from predicting 10,000 classes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR '14, pages 1891–1898, Washington, DC, USA, 2014. IEEE Computer Society.
[27] Y. Sun, X. Wang, and X. Tang. Deeply learned face representations are sparse, selective, and robust. CoRR, abs/1412.1265, 2014.
[28] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the gap to human-level performance in face verification. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014.
[29] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. Web-scale training for face identification. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[30] G. Tolias and Y. S. Avrithis. Speeded-up, relaxed spatial matching. In IEEE International Conference on Computer Vision, ICCV 2011, Barcelona, Spain, November 6-13, 2011, pages 1653–1660, 2011.
[31] J. Wan, D. Wang, S. C. H. Hoi, P. Wu, J. Zhu, Y. Zhang, and J. Li. Deep learning for content-based image retrieval: A comprehensive study. In Proceedings of the ACM International Conference on Multimedia, MM '14, pages 157–166, New York, NY, USA, 2014. ACM.
[32] L. Wolf, T. Hassner, and I. Maoz. Face recognition in unconstrained videos with matched background similarity. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[33] W. Zhang and C.-W. Ngo. Searching visual instances with topology checking and context modeling. In Proceedings of the ACM Conference on International Conference on Multimedia Retrieval, ICMR '13, pages 57–64, New York, NY, USA, 2013. ACM.
[34] C. Zhu, H. Jegou, and S. Satoh. Query-adaptive asymmetrical dissimilarities for visual object retrieval. In IEEE International Conference on Computer Vision, ICCV 2013, Sydney, Australia, December 1-8, 2013, pages 1705–1712. IEEE, 2013.

Another Look at Radial Visualization for Class-preserving Multivariate Data Visualization

Van Long Tran
University of Transport and Communications, Hanoi, Vietnam
E-mail: vtran@utc.edu.vn

Keywords: data visualization, radial visualization, quality visualization

Received: March 24, 2017

Multivariate data visualization is an interesting research field with many applications in various fields of science. Radial visualization is one of the most common information visualization concepts for visualizing multivariate data. However, radial visualization may display different information about the structures of multivariate data.
For example, all points which are multiples of a given point map to the same point in the visual space. An optimal layout of a radial visualization is usually found by defining a suitable order of the data dimensions on the unit circle. In this paper, we propose a novel method that improves the radial visualization layout for cluster preservation of multivariate data. Traditional radial visualizations have their viewpoint at the origin. The idea of our proposed method is to find the most suitable viewpoint among the corners of a hypercube from which to look into the cluster structures of data sets. Our method provides an improvement in visualizing the class structures of multivariate data sets in the radial visualization. We present our method with three kinds of quality measurements and demonstrate the effectiveness of our method on several data sets.

Povzetek: Predstavljena je vizualizacija multivariantnih podatkov.

1 Introduction

Many scientific and business applications produce large data sets with increasing complexity and dimensionality. While information is growing in an exponential way, data are ubiquitous in our world. Data should contain some kind of valuable information that can possibly be explored using human knowledge. However, extracting meaningful information from large-scale data is a difficult task. Information visualization techniques have been proven to be of high value in gaining insight into these large data sets.

The aim of information visualization is to use computer-based interactive visual representations of abstract and non-physically based data to amplify human cognition. It aims at helping users to effectively detect and explore the expected, as well as discover the unexpected, to gain insight into the data [6].

A major challenge for information visualization is how to present multidimensional data to analysts, because complex visual structures occur. Data visualization methods often employ a map from multidimensional data into a lower-dimensional visual space. The reason is that the visual space is composed of two or three spatial coordinates and a limited number of visual factors such as color, texture, etc. However, when the dimensionality of the data is high, usually from tens to hundreds, the mapping from the multidimensional data space into the visual space imposes information loss. This leads to one of the big questions in information visualization [6]: how to project from a multidimensional data space into a low-dimensional space while best preserving the characteristics of the data.

The order of data dimensions is a crucial problem for the effectiveness of many multidimensional data visualization techniques [3] such as parallel coordinates [13], star coordinates [14], radial visualization (Radviz) [10], scatterplot matrices [2], circle segments [4], and pixel recursive patterns [15]. The data dimensions have to be positioned in some one- or two-dimensional arrangement on the screen. The chosen arrangement of data dimensions can have a major impact on the expressiveness of the visualization, because relationships among adjacent dimensions are easier to detect than relationships among dimensions positioned far from each other. Dimension ordering aims to improve the effectiveness of the visualization by giving reasonable orders to the dimensions so that users can easily detect relationships or pay more attention to more important dimensions.
The Radviz technique is one of the most common visualization techniques used in medical analysis [10, 11, 16]. Finding the optimal order of data dimensions in Radviz is known to be NP-complete [3]. Although a number of methods have been proposed for solving the dimension ordering problem in Radviz [16, 8], most of them are exhaustive or greedy local searches in the space of all permutations of the data dimensions. These methods are usually only tested on data sets with a small number of dimensions.

One of the disadvantages of Radviz is that all multidimensional points which differ by a multiplicative constant, i.e., all points cp for a fixed point p and various non-zero scalars c, map to the same position in the visual space. Thus, all these points are separate in the original space but cannot be differentiated in the visual space. This property is invariant under all permutations. Radviz can be explained as a combination of a perspective projection and a linear mapping, with the viewpoint at the origin and the view plane being a simplex. In this paper, we propose another variant of Radviz that supports users in visualizing the data inside a hypercube from an arbitrary viewpoint at the corners of the hypercube. Finding a suitable viewpoint of the hypercube in an n-dimensional space has $2^n$ possible cases. In general, finding a good viewpoint is less complicated than finding a good permutation of the data dimensions in Radviz.

The remaining part of this paper is organized as follows. In Section 2, we present related work on Radviz and data dimension reordering in multivariate data visualization techniques. The inversion axes in Radviz are presented in Section 3. In Section 4, we describe methods for measuring the quality of class visualizations of multivariate data in the visual space. In Section 5, we show the effectiveness of our methods on five well-known multivariate data sets in the case of classified data. In Section 6, we compare the results on the five data sets with dimension permutations in Radviz computed by other algorithms. In Section 7, we present our conclusion and future work.

2 Related work

Principal Component Analysis (PCA) is one of the most common methods for the analysis of multivariate data [12]. Applied to visualizing multivariate data, PCA is a linear projection onto two or three eigenvectors. The general linear mapping can be defined as $P(x) = Vx$, where V is a matrix. PCA projects a multidimensional point x into the space spanned by the two or three eigenvectors corresponding to the two or three largest eigenvalues of the covariance matrix of the given data set.

Star coordinates were introduced by Kandogan [14]. Star coordinates use a linear mapping whose transformation matrix has the i-th column $V_i = (\cos\frac{2\pi i}{n}, \sin\frac{2\pi i}{n})^T$. The vectors $\{V_i, i = 1, 2, \ldots, n\}$ are placed evenly on the unit circle in the two-dimensional visual space. The author also introduced several techniques for interaction with star coordinates, for example moving the axes $V_i$ in the visual space. In [5], 3D star coordinates are introduced with $V_i = (\cos\frac{2\pi i}{n}, \sin\frac{2\pi i}{n}, 1)^T$, extending the 2D star coordinates by adding a third coordinate that is the sum of all data coordinates. Further properties can be found in [20, 17]. Long and Linsen [22] propose optimal 3D star coordinates for visualizing hierarchical clusters in multidimensional data. Radviz was proposed by Hoffman et al. [10].
Radviz can be explained as a perspective projection of the 3D star coordinates with the viewpoint at the origin and the viewing plane z = 1. A normalized Radviz and the properties of Radviz are presented in [7]. An important problem with Radviz is the ordering of the dimensional anchors for a good view of the multivariate data. In [19], the t-statistic method for reordering the dimensional anchors on the unit circle is introduced; the t-statistic is applied to labelled data. Di Caro et al. [8] proposed two methods for dimension arrangement in Radviz, formulated as an optimization problem over a similarity matrix between data dimensions and a neighbourhood matrix between the data dimensions on the unit circle. Albuquerque et al. [1] used the Cluster Density Measure (CDM) for finding a good layout of Radviz; the authors propose a greedy incremental algorithm that successively adds data dimensions to the Radviz layout to determine a suitable order.

3 Radial visualization method

3.1 Radviz

Radviz was first introduced by Hoffman et al. in [10, 11], and it can be regarded as an effective non-linear dimensionality reduction method. Radviz directly maps multidimensional data points into the visual space based on the equilibrium of a spring system. In Radviz, springs are attached to the dimensional anchors. The stiffness of each spring equals the value of the dimension corresponding to its dimensional anchor. The other end of each spring is attached to a point in the visual space, whose location ensures the equilibrium of the spring system. Let $x = (x_1, x_2, \ldots, x_n)$ be a data point in the hypercube $[0, 1]^n$. The dimensional anchors $S_i$, $i = 1, 2, \ldots, n$, are calculated by the formula:

$S_i = \left(\cos\frac{2\pi(i-1)}{n}, \sin\frac{2\pi(i-1)}{n}\right), \quad i = 1, 2, \ldots, n.$

For the spring system to be in equilibrium, we must have $\sum_{i=1}^{n} x_i (p - S_i) = 0$, which gives the location of p as follows:

$p = \frac{\sum_{i=1}^{n} x_i S_i}{\sum_{i=1}^{n} x_i}. \qquad (1)$

Thus, the multidimensional point x is represented by the point p. Figure 1 shows how a sample x of an eight-dimensional space is represented by a point p in a 2-dimensional plot. (A minimal code sketch of this mapping is given after the property list below.)

Figure 1: Radviz visualizes a point in 8 dimensions. The dimensions are represented by points placed equally spaced on the unit circle. An observation x is displayed at position p corresponding to its attributes x1, x2, . . . , x8.

The important properties of the Radviz method are described in [7]:

– If all coordinates of a multidimensional point have the same value, the data point lies exactly at the origin of the plot. Points with approximately equal dimensional values (after normalization) lie close to the center. Points with similar dimensional values whose dimension anchors are opposite each other on the circle also lie near the center.
– If the point is a unit vector, it lies exactly at the fixed point on the edge of the circle where the spring for that dimension is attached. Points which have one or two coordinate values significantly greater than the others lie closer to the dimensional anchors (fixed points) of those dimensions.
– The position of a point depends on the layout of the dimensional anchors around the circle.
– Many points can be mapped to the same position. The mapping represents a non-linear transformation of the data that preserves certain symmetries.
– The Radviz method maps each data record of a multidimensional data set to a point within the convex hull of the dimensional anchors.
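A minimal sketch of the Radviz mapping of Eq. (1), assuming the data are already normalized to [0, 1] and every row has a positive coordinate sum:

```python
import numpy as np

def radviz(X):
    """Map rows of X (n_samples x n_dims, values in [0, 1]) to 2D
    via Eq. (1): p = sum_i x_i S_i / sum_i x_i."""
    n = X.shape[1]
    ang = 2 * np.pi * np.arange(n) / n
    S = np.column_stack([np.cos(ang), np.sin(ang)])   # anchors S_i
    return (X @ S) / X.sum(axis=1, keepdims=True)
```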
We can consider the Radviz nonlinear mapping as the composition of a perspective projection with the viewer at $o = (0, 0, \ldots, 0)$ onto the simplex $\sum_{i=1}^{n} x_i = 1$,

$$V(x) = \left( \sum_{i=1}^{n} x_i \right)^{-1} x,$$

and a linear mapping as in the Star coordinates [14],

$$L_S(x) = \sum_{i=1}^{n} x_i S_i.$$

The Radviz mapping can thus be rewritten as follows:

$$R(x) = L_S(V(x)) = \left( \sum_{i=1}^{n} x_i \right)^{-1} \sum_{i=1}^{n} x_i S_i.$$

3.2 Inversion Radviz

We propose a method that supports users in viewing the unit hypercube from an arbitrary corner. We assume that the viewpoint is a point $p = (p_1, p_2, \ldots, p_n) \in \{0, 1\}^n$. The simplex at the point $p$ is the hyperplane $(\pi_p)$ that goes through the $n$ points $(p_1, \ldots, 1 - p_i, \ldots, p_n)$, $i = 1, 2, \ldots, n$. The equation of this simplex is

$$\sum_{i=1}^{n} (1 - 2p_i) x_i = 1 - \sum_{i=1}^{n} p_i,$$

which we can rewrite as

$$(\pi_p): \sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i) = 1.$$

We now find the position of the multidimensional point $x = (x_1, x_2, \ldots, x_n) \in [0, 1]^n$ in the visual space. The coordinates of the point $x$ with respect to the origin $p$ and the basis vectors

$$\left( (1 - 2p_1) e_1,\ (1 - 2p_2) e_2,\ \ldots,\ (1 - 2p_n) e_n \right)$$

are denoted by

$$x_p = \left( \frac{x_1 - p_1}{1 - 2p_1},\ \frac{x_2 - p_2}{1 - 2p_2},\ \ldots,\ \frac{x_n - p_n}{1 - 2p_n} \right) = \left( p_1 + (1 - 2p_1) x_1,\ \ldots,\ p_n + (1 - 2p_n) x_n \right),$$

where $(e_1, e_2, \ldots, e_n)$ are the standard basis vectors of $\mathbb{R}^n$ (the two expressions coincide because $1 - 2p_i \in \{-1, 1\}$). Obviously, the coordinates of the point $x$ are the coordinates of the vector $x - p$ with respect to the basis above. The perspective projection $V$ maps the point $x_p$ onto the hyperplane $(\pi_p)$ at the point $V_p(x)$, where

$$V_p(x) = \frac{\left( p_1 + (1 - 2p_1) x_1,\ \ldots,\ p_n + (1 - 2p_n) x_n \right)}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)}.$$

Figure 2 displays the viewpoint $p$, the view plane $(\pi_p)$, and the location $V_p(x)$ of the multidimensional point $x$ on the hyperplane $(\pi_p)$.

Figure 2: The perspective projection at corner $p$.

The Radviz projection at the point $p$ is defined as

$$P(x) = \frac{\sum_{i=1}^{n} \left( p_i + (1 - 2p_i) x_i \right) S_i}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)},$$

or, equivalently,

$$P(x) = \frac{\sum_{p_i = 0} x_i S_i + \sum_{p_i = 1} (1 - x_i) S_i}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)}.$$

The $i$th coordinate of the point $x$ corresponding to $p_i = 1$ is thus changed to $1 - x_i$. We therefore propose an inversion Radviz (iRadviz for short) that projects the multidimensional point $x$ onto the visual space as follows:

$$R_{p,S}(x) = \frac{\sum_{p_i = 0} x_i S_i + \sum_{p_i = 1} (1 - x_i) S_i}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)}. \quad (2)$$

Figure 3 shows Radviz and iRadviz visualizing a synthetic data set in three-dimensional space, called the 3D data set. The 3D data set contains 700 points split into seven clusters; each cluster has 100 points and is located at one of the seven vertices of the cube other than the vertex $(1, 1, 1)$. Figure 3 (left) shows the traditional Radviz visualizing the 3D data set: the cluster at the origin $(0, 0, 0)$ is spread over the simplex, and for a three-dimensional data set the Radviz picture is not improved by any permutation of the dimensional anchors (a permutation of three anchors only rotates or reflects the plot). Figure 3 (right) shows the 3D data set with iRadviz using viewpoint $(1, 1, 1)$, where the seven clusters are perfectly separated.

Figure 3: The synthetic 3D data visualization. (Left) Traditional Radviz. (Right) iRadviz with viewpoint $(1, 1, 1)$.

For interaction, users can select a dimensional anchor $p_i$ in Radviz and flip this coordinate of the viewpoint to $1 - p_i$. For finding the optimal viewpoint of the iRadviz of a given data set, we need a quality measurement that defines a suitable view of a multidimensional data set.
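Given `radviz` above, the inversion (2) reduces to flipping the coordinates selected by the viewpoint before projecting. The following sketch (ours, under the same assumptions as before) makes this explicit:

```python
import numpy as np

def iradviz(X, p):
    """Inversion Radviz, equation (2): view the hypercube from corner
    p in {0,1}^n by replacing x_i with 1 - x_i wherever p_i = 1."""
    p = np.asarray(p)
    Xp = np.where(p == 1, 1.0 - X, X)   # coordinates x_p relative to corner p
    return radviz(Xp)                   # then apply the ordinary Radviz mapping

# The 3D example of Figure 3: iradviz(X, (1, 1, 1)) separates the seven
# corner clusters that overlap around the center in radviz(X).
```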
4 Quality measurement

Suppose the data set $X = \{x_i : 1 \le i \le n\}$ is classified into $K$ classes, labeled by $C = \{1, 2, \ldots, K\}$, and denote by $n_k$ the number of data points in the $k$th class. In this section, we briefly present three methods to measure the quality of iRadviz for visualizing supervised data. Without loss of generality, we also denote the data set projected into the visual space by $X = \{x_i : 1 \le i \le n\} \subset \mathbb{R}^2$.

4.1 Class distance consistency

For each class, we denote by $c_k$ the centroid of the $k$th class. A data point $x$ belongs to a particular class if the distance from $x$ to the centroid of this class is smallest. Hence, we define

$$class(x) = \arg\min_{1 \le k \le K} \|x - c_k\|.$$

A data point $x$ is correctly represented if its label is the same as its class; otherwise it is a miss. The Class Distance Consistency (CDC) [21] of the data set $X = \{x_i : 1 \le i \le n\}$ is defined as the fraction of correctly represented data points, i.e.,

$$Q(CDC, X) = \frac{|\{x_i : label(x_i) = class(x_i)\}|}{n}. \quad (3)$$

The CDC quality measurement for class visualization is applicable to clusters of spherical shape.

4.2 Cluster density measurement

The quality Cluster Density Measurement (CDM) [1] is defined as follows:

$$Q(CDM, X) = \sum_{i,j=1}^{K} \frac{d_{ij}^2}{r_i r_j}, \quad (4)$$

where $d_{ij} = \|c_i - c_j\|$ is the Euclidean distance between two cluster centroids, and $r_i$ is the average radius of the $i$th cluster, i.e.,

$$r_i = \frac{\sum_{label(x) = i} \|x - c_i\|}{n_i}.$$

A high quality value indicates well-defined cluster separation, with small intra-cluster distances and large inter-cluster distances. Hence, the higher the quality measure is, the better the visualization of the supervised data set.

4.3 Conditional entropy

The Havrda-Charvát structural $\alpha$-entropy [9] is defined as

$$H_\alpha(X) = \frac{2^{\alpha-1}}{2^{\alpha-1} - 1} \left( 1 - \sum_{i=1}^{n} p^\alpha(x_i) \right), \quad \alpha > 0,\ \alpha \ne 1.$$

A conditional Havrda-Charvát structural $\alpha$-entropy [18] for class visualization quality is defined as follows:

$$H_\alpha(C|X) = \int p(x)\, H_\alpha(C|X = x)\, dx = \frac{2^{\alpha-1}}{2^{\alpha-1} - 1} \left( 1 - \sum_{j=1}^{K} \int p^\alpha(j|x)\, p(x)\, dx \right).$$

We can estimate the conditional entropy $H_\alpha(C|X)$ as follows:

$$H_\alpha(C|X) = \frac{2^{\alpha-1}}{2^{\alpha-1} - 1} \left( 1 - \frac{1}{n} \sum_{j=1}^{K} \sum_{i=1}^{n} p^\alpha(j|x_i) \right).$$

If each data point $x_i$ is classified into only one class, i.e., $p(j|x_i) = 1$ for the $j$th class and $p(j|x_i) = 0$ for any other class, the conditional entropy achieves its minimal value. When $\alpha = 2$, we have the quadratic entropy:

$$H_2(C|X) = 2 \left( 1 - \frac{1}{n} \sum_{j=1}^{K} \sum_{i=1}^{n} p^2(j|x_i) \right).$$

By Bayes' theorem, we have

$$p(j|x) = \frac{p(j)\, p(x|j)}{p(x)}.$$

The prior probability is estimated by $p(j) = n_j / n$. The densities $p(x|j)$ and $p(x)$ are estimated by a nonparametric technique, the Parzen window method. Consider a small region $R(x)$ that contains $x$ and has area $V$. Assume the region $R(x)$ contains $k_j(x)$ points of the $j$th class and $k(x)$ points of the data set. We estimate the densities by

$$p(x|j) = \frac{k_j(x)}{n_j V}, \qquad p(x) = \frac{k(x)}{n V}.$$

Therefore, the conditional probability $p(j|x)$ can be estimated by

$$p(j|x) = \frac{n_j}{n} \cdot \frac{k_j(x)}{n_j V} \cdot \frac{n V}{k(x)} = \frac{k_j(x)}{k(x)}.$$

The entropy quality is defined as follows:

$$Q(ENT, X) = 1 - \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{K} \left( \frac{k_j(x_i)}{k(x_i)} \right)^2. \quad (5)$$

The lower the quality entropy is, the better the clustering visualization. For calculating the entropy quality, we divide the square region that contains the whole data set into $N \times N$ grid cells. The grid size $N$ in two-dimensional space is estimated via the $k$-nearest neighbors: each cell $c$ has 9 neighbor cells (including itself), and on average these 9 cells contain $9n / N^2$ points. The grid size $N$ is obtained by requiring $9n / N^2 = \sqrt{n}$, i.e.,

$$N = 1 + \left[ 3 \sqrt[4]{n} \right].$$

For each cell $c$, we store the class point counts $c = (c_1, c_2, \ldots, c_K)$, where $c_j$ is the number of points of the $j$th class falling into the cell $c$. For each point $x$ that falls in the cell $c$, the region $R(x)$ contains all cells that are neighbors of the cell $c$. We have

$$k_j(x) = \sum_{c' \in R(x)} c'_j \qquad \text{and} \qquad k(x) = \sum_{j=1}^{K} k_j(x).$$

The complexity of computing the entropy quality is therefore $O(Kn)$, i.e., it has linear time complexity.
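All three measures are short to implement. The sketch below is ours (numpy assumed; labels are assumed to be integers $0, \ldots, K-1$; `iradviz` as in Section 3.2) and includes the exhaustive search over the $2^n$ corners used to select a viewpoint. Note that CDC and CDM are to be maximized, while the entropy quality (5) is to be minimized:

```python
import numpy as np
from itertools import product

def cdc(P, labels):
    """Class Distance Consistency, equation (3)."""
    classes = np.unique(labels)
    cents = np.array([P[labels == k].mean(axis=0) for k in classes])
    d = np.linalg.norm(P[:, None, :] - cents[None, :, :], axis=2)
    return np.mean(classes[d.argmin(axis=1)] == labels)

def cdm(P, labels):
    """Cluster Density Measure, equation (4)."""
    classes = np.unique(labels)
    cents = np.array([P[labels == k].mean(axis=0) for k in classes])
    radii = np.array([np.linalg.norm(P[labels == k] - c, axis=1).mean()
                      for k, c in zip(classes, cents)])
    d2 = ((cents[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
    return (d2 / np.outer(radii, radii)).sum()   # d_ii = 0, so i = j adds nothing

def ent(P, labels):
    """Entropy quality, equation (5), estimated on an N x N grid with
    3 x 3 cell neighborhoods and N = 1 + [3 * n^(1/4)]."""
    n, K = len(P), int(labels.max()) + 1
    N = 1 + int(3 * n ** 0.25)
    lo, hi = P.min(axis=0), P.max(axis=0)
    cells = np.minimum(((P - lo) / (hi - lo + 1e-12) * N).astype(int), N - 1)
    counts = np.zeros((N + 2, N + 2, K))        # padded so border cells have 9 neighbors
    for (a, b), lab in zip(cells, labels):
        counts[a + 1, b + 1, lab] += 1          # class point counts per cell
    q = 0.0
    for a, b in cells:
        kj = counts[a:a + 3, b:b + 3].sum(axis=(0, 1))  # k_j(x) over the 9 cells
        q += ((kj / kj.sum()) ** 2).sum()               # sum_j (k_j(x)/k(x))^2
    return 1.0 - q / n

def best_viewpoint(X, labels, quality=cdc, maximize=True):
    """Brute-force search over the 2^n corners of the unit hypercube."""
    sign = 1.0 if maximize else -1.0
    return max(product((0, 1), repeat=X.shape[1]),
               key=lambda p: sign * quality(iradviz(X, np.asarray(p)), labels))
```

Searching all $2^n$ corners is feasible for the data sets used here ($n \le 13$); for the entropy measure one would call `best_viewpoint(X, labels, quality=ent, maximize=False)`.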
5 Experimental results

We tested our approach on five data sets. For each data set, we find the viewpoint for iRadviz based on each of the three quality measurements presented in Section 4.

The first, well-known data set is the Iris data set (http://archive.ics.uci.edu/ml/datasets/Iris). The Iris data set contains 150 data points, four attributes — X1 (sepal length), X2 (sepal width), X3 (petal length), X4 (petal width) — and three classes: Setosa (50 data points), Versicolour (50 data points), and Virginica (50 data points). Figure 4 shows the iRadviz approach for visualizing the Iris data set; classes are encoded by different colors. One class (red) is perfectly separated from the other two. Figure 4 (left), with inversion of the axes X2, X3, X4, and Figure 4 (right), with inversion of the axes X1, X2, X3, X4, show the three classes better separated than Figure 4 (middle), without inversion of the axes.

Figure 4: The Iris data. (Left) The best iRadviz visualization based on CDC quality. (Middle) The best iRadviz visualization based on CDM quality. (Right) The best iRadviz visualization based on Entropy quality.

The second data set is the Wine data set (http://archive.ics.uci.edu/ml/datasets/Wine). The Wine data set includes 178 data points with 13 attributes: X1 (Alcohol), X2 (Malic acid), X3 (Ash), X4 (Alcalinity of ash), X5 (Magnesium), X6 (Total phenols), X7 (Flavanoids), X8 (Nonflavanoid phenols), X9 (Proanthocyanins), X10 (Color intensity), X11 (Hue), X12 (OD280/OD315 of diluted wines), and X13 (Proline). The Wine data set is classified into three classes: class 1 (59 data points), class 2 (71 data points), and class 3 (48 data points). Figure 5 shows the Wine data set from different viewpoints using iRadviz; the different colors represent the different classes. Figure 5 (left) shows the best iRadviz visualization for the Wine data set with the highest CDC quality, where inversion was applied to axes X4, X5, X7, X10. Figure 5 (middle) shows the best iRadviz visualization with the highest CDM quality, where inversion was applied to axes X1, X2, X3, X4, X8, X9, X11, X12, X13. Figure 5 (right) shows the best iRadviz visualization with the highest Entropy quality, where inversion was applied to axes X6, X7, X10.

Figure 5: The Wine data. (Left) The best CDC quality of the iRadviz visualization. (Middle) The best CDM quality of the iRadviz visualization. (Right) The best Entropy quality of the iRadviz visualization.

The third data set, Y14c, is a synthetic data set that contains 480 data points with ten attributes, partitioned into 14 clusters. Figure 6 shows three views of the Y14c data with different viewpoints in iRadviz; the inverted axes are highlighted in red. Figure 6 (left) shows the best iRadviz class visualization of this data under the CDC quality, with inversion of axes 2, 3, 4, 5, 6, 7. The clusters shown in this figure are well separated.

Figure 6: The Y14c data. (Left) The best quality CDC on iRadviz. (Middle) The best quality CDM on iRadviz. (Right) The best quality Entropy on iRadviz.
Figure 6 (middle) shows the best iRadviz based on the highest CDM quality, with inversion of axes 1, 2, 3, 6, 10; several clusters overlap in this visualization. Figure 6 (right) shows the best iRadviz based on the highest Entropy quality, with inversion of axes 1, 2, 3, 4, 6, 9; in this figure the clusters are perfectly separated. The Y14c data set contains two clusters that differ only by a scale factor. Since the Radviz mapping (1) is invariant under scaling of a data point, these clusters fully overlap in Radviz under any permutation of the dimensional anchors.

The fourth data set is the Italian Olive Oils data (Olive for short; http://cran.r-project.org/). The Olive data set consists of 572 data samples with eight attributes describing eight fatty acids (X1 palmitic, X2 palmitoleic, X3 stearic, X4 oleic, X5 linoleic, X6 linolenic, X7 arachidic, X8 eicosenoic). The Olive data set is classified into nine clusters, each corresponding to one of nine areas of Italy. Figure 7 shows the iRadviz class visualization of the Olive data set with the best quality based on CDC (left), CDM (middle), and Entropy (right). In Figure 7 (left and right) the classes are more separated than in Figure 7 (middle).

Figure 7: The Olive Oil data. (Left) The best quality CDC on iRadviz. (Middle) The best quality CDM on iRadviz. (Right) The best quality Entropy on iRadviz.

The last data set is called Ecoli (https://archive.ics.uci.edu/ml/datasets/Ecoli). The Ecoli data set contains 336 data samples, each consisting of seven attributes. It is partitioned into eight clusters with 143, 77, 52, 35, 20, 5, 2, and 2 data samples respectively; the last three clusters contain very few samples. Figure 8 shows the class visualization using iRadviz with the best quality based on CDC (left), CDM (middle), and Entropy (right).

Figure 8: The Ecoli data set. (Left) The best quality CDC on iRadviz. (Middle) The best quality CDM on iRadviz. (Right) The best quality Entropy on iRadviz.

6 Comparison and discussion

In this section, we present quality measurements of our proposed method versus permutation, and of our method versus other algorithms.

6.1 Inversion dimension versus permutation

For the first three data sets (Iris, Ecoli, and Olive), we find the globally best permutation for each quality measurement by searching over all permutations. For the two remaining data sets (Y14c and Wine), we find the locally best permutation. We call two permutations of the data dimensions neighbors if they differ by a swap of two consecutive positions; the locally best permutation achieves the best quality over all of its neighbor permutations.

Class Distance Consistency: Table 1 shows that the quality of our approach is better than the CDC quality of [21] for the Iris, Ecoli, Y14c, and Wine data sets and is slightly lower than the CDC quality for the Olive data set.

Table 1: The best CDC quality over permutation and over inversion axes.

               Iris      Ecoli     Olive     Y14c      Wine
  Permutation  84.67%    67.56%    82.34%    93.96%    94.94%
  iRadviz      94.00%    78.57%    80.24%    100%      96.63%

Cluster Density Measurement: Table 2 shows that the CDM quality of our approach is better than the CDM quality of [2] for the last two data sets, lower for the Ecoli and Olive data sets, and the same for the Iris data set.

Table 2: The best CDM quality over permutation and over inversion axes.

               Iris      Ecoli     Olive     Y14c      Wine
  Permutation  44.242    42.457    27.825    358.37    13.914
  iRadviz      44.242    32.325    23.078    459.824   16.634

Entropy Measurement: Table 3 shows that the Entropy quality of our approach is better than the Entropy quality of [18] for the Iris, Ecoli, and Y14c data sets, and slightly lower for the Olive and Wine data sets.

Table 3: The best Entropy quality over permutation and over inversion axes.

               Iris      Ecoli     Olive     Y14c      Wine
  Permutation  0.1316    0.2057    0.1198    0.0648    0.0084
  iRadviz      0.0028    0.1645    0.1281    0.000     0.0261

6.2 Inversion axes versus other permutation algorithms

In this section, we compare the quality measurements of our method with those of the t-statistic method and the CDM method for permutation on Radviz [1]. The best permutation in Radviz for the Wine data by the t-statistic method is {1, 2, 4, 8, 10, 11, 13, 12, 9, 7, 6, 5, 3}, and the CDM method delivers {8, 3, 4, 2, 10, 13, 1, 5, 6, 7, 9, 12, 11}. The best permutation in Radviz for the Olive data by the t-statistic method is {1, 2, 5, 4, 8, 7, 3, 6}, and the CDM method delivers {1, 3, 4, 7, 6, 2, 8}.
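The neighbor-swap local search used for the Y14c and Wine baselines in Section 6.1 can be sketched as follows (our sketch; `radviz` and `cdc` as above, and the greedy first-improvement strategy is our assumption, since the paper does not specify the search order):

```python
def local_best_permutation(X, labels, quality=cdc):
    """Hill climbing over neighbor permutations, i.e., permutations that
    differ by swapping two consecutive dimensional anchors."""
    perm = list(range(X.shape[1]))
    best = quality(radviz(X[:, perm]), labels)
    improved = True
    while improved:
        improved = False
        for i in range(len(perm) - 1):
            cand = perm[:i] + [perm[i + 1], perm[i]] + perm[i + 2:]
            q = quality(radviz(X[:, cand]), labels)
            if q > best:                 # first improvement: restart from cand
                perm, best, improved = cand, q, True
    return perm
```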
Table 4 shows the CDC and Entropy (ENT) quality measurements for the Olive and Wine data sets, for the t-statistic method, the CDM method, and our method. The overall quality measurements of our approach are better than those of the t-statistic and CDM methods, except for the Entropy quality measure applied to the Wine data set.

Table 4: The quality measurements for the Olive and Wine data sets.

                   Olive               Wine
  Method           CDC      ENT       CDC      ENT
  t-statistic      55.95%   0.4090    75.28%   0.1643
  CDM              76.57%   0.1826    88.87%   0.0176
  Our method       80.02%   0.1281    96.63%   0.0261

Figure 9 (left) shows Radviz visualizing the Wine data set with the best permutation by the t-statistic method, and Figure 9 (right) shows Radviz visualizing the Wine data set with the best permutation by the CDM method. For comparison, Figure 5 shows the Wine data set with inverted axes. Figure 9 (left) shows the lowest quality of class separation for the Wine data set, while Figure 5 (left) shows the highest.

Figure 9: The Wine data. (Left) The best permutation by the t-statistic method. (Right) The best permutation by the CDM method.

Figure 10 shows the Olive data set with the two best permutations, using the t-statistic method (left) and the CDM method (right); a comparison with the inversion-axes layout is provided in Figure 7. Figure 10 (left and right) shows the lowest quality of class separation in the visual space, while Figure 7 (left and right) exhibits higher quality of class separation for both.

Figure 10: The Olive Oil data. (Left) The best permutation by the t-statistic method. (Right) The best permutation by the CDM method.

7 Conclusion

We have presented a new method for visualizing multidimensional data based on radial visualization. Our proposed method supports users in choosing a suitable view of a data set in the unit hypercube. We demonstrated the effectiveness of our method against the permutation of dimensional anchors in Radviz on several supervised data sets, both synthetic and real. For future work, we want to improve our method to enhance class structures in subspaces of supervised data sets. Moreover, we want to develop other quality measurements for supervised data sets.

Acknowledgement

This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2012.04.

References

[1] G. Albuquerque, M. Eisemann, D. J. Lehmann, H. Theisel, and M. Magnor. Improving the visual analysis of high-dimensional datasets using quality measures. In IEEE Symposium on Visual Analytics Science and Technology (VAST), pages 19–26, 2010.
[2] G. Albuquerque, M. Eisemann, D. J. Lehmann, H. Theisel, and M. A. Magnor. Quality-based visualization matrices. In Proceedings of the Vision, Modeling and Visualization Workshop (VMV), Braunschweig, Germany, pages 341–350, 2009.

[3] M. Ankerst, S. Berchtold, and D. A. Keim. Similarity clustering of dimensions for an enhanced visualization of multidimensional data. In Proceedings of the IEEE Symposium on Information Visualization (InfoVis '98), pages 52–60, 1998.

[4] M. Ankerst, D. A. Keim, and H.-P. Kriegel. Circle segments: A technique for visually exploring large multidimensional data sets. In Proceedings of the 1996 IEEE Symposium on Information Visualization, Hot Topic Session, San Francisco, CA, 1996.

[5] A. O. Artero and M. C. F. de Oliveira. Viz3d: Effective exploratory visualization of large multidimensional data sets. In Proceedings of the 17th Brazilian Symposium on Computer Graphics and Image Processing, pages 340–347, 2004.

[6] S. K. Card, J. D. Mackinlay, and B. Shneiderman. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann, 1999.

[7] K. Daniels, G. Grinstein, A. Russell, and M. Glidden. Properties of normalized radial visualizations. Information Visualization, 11(4):273–300, 2012.

[8] L. di Caro, V. Frias-Martinez, and E. Frias-Martinez. Analyzing the role of dimension arrangement for data visualization in Radviz. In Advances in Knowledge Discovery and Data Mining, pages 125–132, 2010.

[9] J. Havrda and F. Charvát. Quantification method of classification processes: Concept of structural α-entropy. Kybernetika, 3(1):30–35, 1967.

[10] P. Hoffman, G. Grinstein, K. Marx, I. Grosse, and E. Stanley. DNA visual and analytic data mining. In Proceedings of the 8th Conference on Visualization 1997, pages 437–441, 1997.

[11] P. Hoffman, G. Grinstein, and D. Pinkney. Dimensional anchors: a graphic primitive for multidimensional multivariate information visualizations. In Proceedings of the 1999 Workshop on New Paradigms in Information Visualization, pages 9–16, 1999.

[12] I. Jolliffe. Principal Component Analysis. Wiley Online Library, 2005.

[13] A. Inselberg. The plane with parallel coordinates. The Visual Computer, 1(2):69–91, 1985.

[14] E. Kandogan. Star coordinates: A multi-dimensional visualization technique with uniform treatment of dimensions. In Proceedings of the IEEE Information Visualization Symposium 2000, volume 650, pages 4–8, 2000.

[15] D. A. Keim, M. Ankerst, and H.-P. Kriegel. Recursive pattern: A technique for visualizing very large amounts of data. In Proceedings of the 6th Conference on Visualization '95, pages 279–286, 1995.

[16] G. Leban, B. Zupan, G. Vidmar, and I. Bratko. VizRank: Data visualization guided by machine learning. Data Mining and Knowledge Discovery, 13:119–136, 2006.

[17] D. J. Lehmann and H. Theisel. Orthographic star coordinates. IEEE Transactions on Visualization and Computer Graphics, 19(12):2615–2624, 2013.

[18] X. Li, K. Zhang, and T. Jiang. Minimum entropy clustering and applications to gene expression analysis. In Computational Systems Bioinformatics Conference (CSB 2004), pages 142–151, 2004.

[19] J. McCarthy, K. Marx, P. Hoffman, A. Gee, P. O'Neil, M. Ujwal, and J. Hotchkiss. Applications of machine learning and high-dimensional visualization in cancer detection, diagnosis, and management. Annals of the New York Academy of Sciences, 1020(1):239–262, 2004.
[20] M. Rubio-Sanchez and A. Sanchez. Axis calibration for improving data attribute estimation in star coordinates plots. IEEE Transactions on Visualization and Computer Graphics, 20(12):2013–2022, Dec. 2014.

[21] M. Sips, B. Neubert, J. P. Lewis, and P. Hanrahan. Selecting good views of high-dimensional data using class consistency. Computer Graphics Forum, 28(3):831–838, 2009.

[22] T. Van Long and L. Linsen. Visualizing high density clusters in multidimensional data using optimized star coordinates. Computational Statistics, 26(4):655–678, 2011.

Emotional Contagion Model for Group Evacuation Simulation

Xuan-Hien Ta
Toulouse University, UPS-IRIT, Toulouse, France
E-mail: hientpbk@gmail.com

Benoit Gaudou
Toulouse University, UT1C-IRIT, Toulouse, France
E-mail: benoit.gaudou@gmail.com

Dominique Longin
Toulouse University, CNRS-IRIT, Toulouse, France
E-mail: Dominique.Longin@irit.fr

Tuong Vinh Ho
Vietnam National University, Hanoi, Vietnam
E-mail: vinhht@vnu.edu.vn

Keywords: emotion, simulation, agent-based model, GAMA platform, crisis situation, evacuation process

Received: March 29, 2017

The key role of emotions in the decision-making process of human beings has been highlighted recently. Our research focuses on fear-related emotions and their positive impact on the survival capabilities of human beings in crisis situations. In this paper, we propose a new model of emotional contagion based on some main findings in social psychology. This model was formalized mathematically, implemented and tested in the GAMA agent-based simulation platform in the context of evacuation simulation. We assessed experimentally the impact of three factors (emotion decay, environment, neighbors' emotional contagion) on emotion dynamics at the individual and group levels. The experimental results allow us to understand the emotional contagion of an agent group in several scenarios. The proposed model will help us to better study the impact of emotional contagion on evacuation safety. The entire theoretical model has been implemented in the simulation platform GAMA.

Povzetek: The paper analyzes fear-related emotions in the case of an evacuation.

1 Introduction

Emotions, those reflexes that push human beings to make decisions quickly and without a deep and clear reasoning process, have long been considered contrary to rational reasoning processes. Only recently has the key role of emotions in the decision-making process been highlighted. We focus on fear-related emotions and their positive impact on the survival capabilities of human beings in crisis situations. Indeed, recent works have shown that emotion is a very important factor in the understanding of human behaviour in crisis situations (see [9, 10, 28, 4] for instance). Emotion has been studied for a long time in psychology and in philosophy, and more recently in cognitive science (see [27, 21, 31, 12] for instance). These works have shown the close relationship between a person's emotional state and that person's action tendencies. Indeed, emotions play a central role in cognition, especially when we need to react very quickly (as is the case in crisis situations). Instantaneously, emotions provide us with a set of possible actions (called action tendencies by Lazarus [21]) that are strongly related to the situation.
An emotion can be viewed as a summary of the situation, of how this situation can affect us, and of what power we have over the real world to change the present situation into a positive one for us. So, emotions have a great power of explanation of our actions in crisis situations.

In crisis situations, the most remarkable expression of fear is definitely panic behavior. While early research on panic presented it as groundless fear or flight behavior, other work describes it as a crowd in dissolution. Nevertheless, in situations such as fires or disasters, [26] has shown that it is in fact a very meaningful behaviour, far from most conceptions of irrationality. Panic behaviour exists but is in fact quite rare. It is an individual behaviour, in opposition to a behaviour of the crowd; it is not contagious and occurs over a short duration. It is rarely observed in crisis situations. Some particular conditions of panic triggering have been identified, such as: the perception of a great threat to self, a belief that escape from the threat is possible but very hard to achieve, and a feeling of helplessness [28, 14]. Some additional factors may also have an influence on triggered emotions, such as experience in emergency situations and information; information is the key to a successful evacuation strategy during a crisis [29]. The sex and age of an individual can also cause a different fear level.

In addition, as has been shown in [28], panic is not the predominant emotion in crisis situations. A lot of reports (see [11] for instance) show that when the danger increases, the mutual aid between the humans exposed to this danger also increases. People share emotions and information, and they help each other, even if they were strangers before; there are very few cases of selfishness. One of the faces of this mutual aid is the constitution of groups of persons: people in a group of friends or in a family try to stay together every time it is possible. Sociological studies show that groups increase our chances to be saved [9] (an evolutionary condition). In our previous work [32], we studied the impact of groups on the evacuation process. In this paper, we focus only on emotion contagion.

In the simulation area, a lot of works focus more specifically on emotion contagion. For instance, in [24], the authors present simulations about the relationships between emotions, information and beliefs. All members of a group can absorb the emotions of other members (of the same group) to create an average value of emotion, but they can also be influenced by the members of other groups. In this case, the average emotion of the group can be increased (amplification) or decreased (absorption). We can understand the absorption of emotions as a bottom-up approach, and the amplification of emotion as a top-down approach. The authors propose the idea that agents with a high emotion (above a high threshold) or a low emotion (under a low threshold) will play different roles (increasing or decreasing) depending on characteristics of the agent such as openness, expressiveness, and the capacity to receive or express emotions from/to others. Similarly, in [5], the authors give another interesting perspective on the contagion of emotion within a group.

In the GAMA agent-based simulation community [33, 17], several models (see [25, 22] for instance) have shown the important role played by emotions in emergency situations.
In [25], the authors simulate the emotion dynamics in a group. They give a new operational model of emotion contagion and implement the evacuation process (avoiding both obstacles and the other agents). They evaluate the model with respect to the evacuation time according to several criteria. When the emotion intensity changes, the walking speed of the corresponding agents also changes and impacts the evacuation time. But we can also criticize the fact that the emotion modeling is still very basic: we need a more complex cognitive model of emotions if we want to simulate agent behaviors as naturally as possible.

This article provides a new model of emotion dynamics. We focus here only on fear, because this emotion plays an important role in crisis situations. We propose to model the emotion following three main findings from both cognitive psychology and social psychology:

1. Emotions have triggering conditions (see [27, 21] for instance): a cognitive appraisal of these conditions determines whether they are fulfilled or not. (By this assumption, we suppose that emotion belongs to cognition; this is the point of view of the great majority of the psychology community (see [21, 27, 12, 31] for instance), and this view is called the "cognitive theory of emotion".) Following these authors, fear is triggered when we perceive a danger to our own life. Here, perception can be direct (an agent sees a fire or hears an alarm) or indirect (some other agents feeling fear influence the fear level of this agent).

2. Emotion intensity decreases with time: when the triggering conditions are no longer satisfied, an emotion does not disappear instantaneously (it is a process that takes time).

3. Finally, new perceptions from the environment (fires, alarms, influence of others) can modify the intensity level of fear, which can increase or decrease.

As far as we know, there is no model that takes all these factors into account in an intuitive manner. More precisely, many factors may impact the emotion, but here we only take into account three main ones: the environment (crisis perception), emotional decay and contagion. The emotion model is implemented in GAMA — an open-source, generic agent-based modeling and simulation platform that provides powerful tools to easily develop agent-based models, in particular using geographical data, and that allows the modeler to run simulations in either an interactive or a batch mode, which we use to run the experiment designs exploring the model — and is part of a project about evacuation simulations in crisis situations.

This paper is organized as follows. We first describe the model of emotion dynamics in Section 2. In Section 3, we assess the impact of the three factors (emotion decay and contagion, environment) on the emotion dynamics. Then we conduct a sensitivity analysis of the emotion model in Section 4. Finally, we conclude our work with some perspectives.

2 Model of emotion dynamics

2.1 Agent structure

As presented above, this article focuses only on one emotion and its diffusion, so the environment is described in a simple manner. In particular, there are neither obstacles nor exit doors (because neither has any impact on our results); the environment only contains some fires and human agents.

Let $AGT = \{i, j, k, \ldots\}$ be the finite set of human agents used in the simulation, $FIRE = \{f_1, f_2, \ldots\}$ the finite set of fires, and $TIME = \{t_0, t_1, \ldots\}$ the finite set of time points, where $t_0$ is the initial state of the simulation. The set of all the entities of the simulation is $ENT = AGT \cup FIRE$. We denote by $card(E)$ the cardinality of the set $E$; so $card(AGT)$, for instance, is the number of agents, and $t_{card(TIME)-1}$ is the final state of the simulation.
Each agent $i$ at time $t$ is characterized by the 6-tuple $\langle pos_i, visualRadius_i, neighbRadius_i, emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i \rangle$, where:

– $pos_i : TIME \to \mathbb{R} \times \mathbb{R}$ is the function that maps each time point $t$ to the position $pos_i(t)$ of agent $i$ at time $t$. We extend this function to any entity $e \in ENT$.

– $visualRadius_i : TIME \to \mathbb{R}$ is the function that maps each time point $t$ to the visual radius $visualRadius_i(t)$ of $i$ at time $t$. We consider here that each agent has its own perception radius and that this perception radius can change during the evacuation process (because of smoke, fire, obstacles, etc.). In some scenarios, we suppose that the value $d$ of the visual radius does not change over time, and we write $visualRadius_i = d$.

– $neighbRadius_i : TIME \to \mathbb{R}$ is the function that maps each time point $t$ to the neighborhood radius $neighbRadius_i(t)$ of $i$ at time $t$. We impose that $neighbRadius_i(t) \le visualRadius_i(t)$ for every agent $i$ and time point $t$. In some scenarios, we suppose that the value $d$ of the neighborhood radius does not change over time, and we write $neighbRadius_i = d$.

– $emDecayCoeff_i \in [0, 1]$ is the decay coefficient of $i$'s emotion intensity (see Section 2.2). From a psychological point of view, some agents are more impressionable than others; this coefficient depends on personologic data [11], and we suppose here that it does not change over time.

– $fireInflCoeff_i \in [0, 1]$ is the fire influence coefficient on $i$. Since some agents can be more experienced with some dangers (such as fire) than other agents, the impact of a given danger depends on the agent who faces it. The more experienced an agent is with a danger, the lower its fire influence coefficient.

– $agtInflCoeff_i : AGT \to [0, 1]$ maps every agent $j \in AGT$ to the coefficient of influence $agtInflCoeff_i(j)$ of agent $j$ on $i$. It is well known in the social influence literature (see [19, 15] for instance) that we are influenced by others with respect to beliefs, desires, norms, etc. It is the same with emotional states, but, due to the personality of each person, one can be more or less influenced by others. The coefficient $agtInflCoeff_i(j)$ takes this aspect into account: the higher this coefficient, the more agent $i$ is influenced by agent $j$.

So, we are able to define the following abbreviations (for every $e, e' \in ENT$, $t \in TIME$ and $i \in AGT$):

$$distance(e, e', t) \stackrel{def}{=} \| pos_{e'}(t) - pos_e(t) \|$$

$$detectedFires_i(t) \stackrel{def}{=} \{ f \in FIRE : distance(i, f, t) \le visualRadius_i(t) \}$$

$$minDistFires_i(t) \stackrel{def}{=} \min \{ distance(i, f, t) : f \in detectedFires_i(t) \}$$

$$N_i(t) \stackrel{def}{=} \{ j \in AGT : distance(i, j, t) \le neighbRadius_i(t) \}$$

$distance(e, e', t)$ is the distance between the positions of entities $e$ and $e'$ at time $t$; $detectedFires_i(t)$ is the set of fires within the visual radius of agent $i$ at time $t$; $minDistFires_i(t)$ is the minimal distance between agent $i$ and all the fires it perceives at time $t$; and $N_i(t)$ is the set of neighbors of agent $i$ at time $t$.
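As a minimal sketch of this structure (ours — the paper's implementation is in GAMA, not Python, and all names below are our own; fire positions are assumed to be numpy arrays):

```python
import numpy as np

class Agent:
    """The 6-tuple characterizing an agent, plus its current fear level."""
    def __init__(self, pos, visual_radius, neighb_radius,
                 em_decay_coeff, fire_infl_coeff, agt_infl_coeff):
        self.pos = np.asarray(pos, dtype=float)   # pos_i(t)
        self.visual_radius = visual_radius        # visualRadius_i(t)
        self.neighb_radius = neighb_radius        # neighbRadius_i(t) <= visualRadius_i(t)
        self.em_decay_coeff = em_decay_coeff      # emDecayCoeff_i in [0, 1]
        self.fire_infl_coeff = fire_infl_coeff    # fireInflCoeff_i in [0, 1]
        self.agt_infl_coeff = agt_infl_coeff      # maps agent j -> agtInflCoeff_i(j)
        self.fear = 0.0                           # fear_i(t), defined in Section 2.2

def detected_fires(i, fires):
    """detectedFires_i(t): fires within the visual radius of agent i."""
    return [f for f in fires if np.linalg.norm(f - i.pos) <= i.visual_radius]

def min_dist_fires(i, fires):
    """minDistFires_i(t), or None when no fire is perceived."""
    seen = detected_fires(i, fires)
    return min(np.linalg.norm(f - i.pos) for f in seen) if seen else None

def neighbors(i, agents):
    """N_i(t): agents within the neighborhood radius (self excluded here,
    which is our assumption)."""
    return [j for j in agents if j is not i
            and np.linalg.norm(j.pos - i.pos) <= i.neighb_radius]
```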
We suppose here that the closer a fire is to us, the more afraid of it we are. For the sake of simplicity, we therefore suppose that the emotional reaction to distant dangers is subsumed by the emotional reaction to the closest danger(s) we perceive, so only the closest fires are taken into account here.

Finally, we define in the next sections the function $fear_i(t)$ that computes the fear level of agent $i$ at each time point $t$. At the initial time $t_0$, $fear_i(t_0)$ is fixed for each agent $i$; the fear level at time $t > t_0$ is computed dynamically during the simulation steps. More precisely, the fear intensity change from time $t - 1$ to time $t$ (that is, the change from $fear_i(t-1)$ to $fear_i(t)$) is a three-step process depending on three successive functions:

1. $\Delta fearDecay_i(t)$ describes the loss of emotion intensity from $t - 1$ to $t$ due to time. If $fear_i(t-1) = 0$ (that is, the fear level at time $t - 1$ is 0), then $\Delta fearDecay_i(t) = 0$; otherwise, $\Delta fearDecay_i(t)$ is the value corresponding to the loss of emotion intensity between $t - 1$ and $t$ (see Section 2.2).

2. $\Delta fearEnv_i(t)$: if the current fear level after decay is equal to 0, then a value computed from a sigmoid function is returned; otherwise, the variation of the fear between $t - 1$ and $t$ is added. This variation is computed from the derivative of the sigmoid between $t - 1$ and $t$ and corresponds to the effect on $i$'s fear level of the fires that agent $i$ detects around itself (if fires are detected) (see Section 2.3).

3. $\Delta fearNeighb_i(t)$ is the variation of the fear (which can be positive or negative) coming from the influence of $i$'s neighbors. If these neighbors have a fear level lower than the fear level of $i$ (after decay and influence of the environment), then the fear level of $i$ decreases; otherwise it increases (see Section 2.4).

Finally, $fear_i(t)$ is the new value of the fear intensity at time $t$, defined as a composition of the above three components. Note that we could compute the fear level as the sum of three independent functions: one for the decay process, one for the environment influence process, and one for the neighborhood influence process. But such a sum could be less than 0 or greater than 1, whereas we require the fear level to stay between 0 and 1. So, we prefer to compute the resulting emotion intensity as a composition of functions, because this avoids situations where the result is not between 0 and 1.

2.2 Emotion Decay over Time

As highlighted in the literature [27, Chap. 4], without any stimulus, an agent's fear intensity decreases over time. This decay is often described as faster for higher values of emotion intensity, slowing down when the emotion intensity is low. At time $t$ and for every agent $i \in AGT$, the value of the fear decay (the loss of emotion intensity) is noted $\Delta fearDecay_i(t)$. This value is a function of the previous emotion level $fear_i(t-1)$ and of $emDecayCoeff_i \in [0, 1]$ (the decay coefficient, which depends on some attributes of each agent such as gender, age, etc. [11]). Moreover, we suppose that this decay coefficient does not vary over time. These requirements lead us to the following function for emotion decay over time (see Figure 1):

$$\Delta fearDecay_i(t) \stackrel{def}{=} -emDecayCoeff_i \times fear_i(t-1). \quad (1)$$

We can first notice that if $fear_i(t-1) = 0$ (e.g. at the initial simulation step), then $\Delta fearDecay_i(t) = 0$, and $fear_i(t)$ (the emotion level at time $t$) is not modified by (1). So the decay does not trigger any emotion; it only decreases its value with time. Moreover, the greater $emDecayCoeff_i$ is, the more quickly the emotional level decreases. Finally, note that the emotion decay has the same shape as the "activation level decreasing" in Anderson's theory of central cognition [3]. It could certainly be oversubtle, but this form has the advantage of being computationally convenient.

In Figure 1, the fear function is limited to the fear decay effect (what we call $fearDecay_i(t)$), so its evolution is described by

$$fearDecay_i(t) = fear_i(t-1) + \Delta fearDecay_i(t).$$

Figure 1: Fear decay with $emDecayCoeff_i = 0.02$ for any agent $i$ and without any other stimulus.
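In code, the decay step is one line (a sketch under the same assumptions as before):

```python
def fear_decay(fear_prev, em_decay_coeff):
    """Decay step: fearDecay_i(t) = fear_i(t-1) + Delta_fearDecay_i(t),
    with Delta_fearDecay_i(t) = -emDecayCoeff_i * fear_i(t-1), eq. (1)."""
    return fear_prev - em_decay_coeff * fear_prev

# Without any stimulus the level decays geometrically, as in Figure 1:
# starting from 1.0 with emDecayCoeff_i = 0.02, the level after t steps
# is 0.98 ** t.
```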
2.3 Environment Influence on Emotion

The environment contains dangers (fires for instance), warnings (alarms, ...) and other elements (smoke, ...) that may have an impact on emotions. In particular, dangers may trigger a fear emotion or increase the fear intensity. In the following, we consider two distinct processes: a) emotion is triggered when the agent does not feel fear yet, and b) the fear level is updated when an agent already feels fear and has to face a hazard.

Emotion triggering (when $fearDecay_i(t) = 0$). When agent $i$ does not feel fear at time $t$ just after the emotion decay computation ($fearDecay_i(t) = 0$) and perceives a hazard or hears an alarm, the appraisal of this stimulus triggers an emotion. We make the assumption that both the distance to the danger and the number of dangerous elements the agent perceives influence the intensity of the triggered emotion.

The fear degree should be an increasing function of the number of hazards, but a logarithm-like one, to capture the fact that the difference in intensity is greater when the agent observes a small number of fires (for instance, 2 fires instead of 1) than when it observes a huge number (for instance, 102 fires instead of 101). In addition, we consider that the intensity should be a decreasing function of the distance to the hazard, and we assume that the relevant distance $minDistFires_i(t)$ at time $t$ from agent $i$ to the hazards is the distance to the closest hazard, not the average distance to all fires in $i$'s neighborhood (see Section 2.1).

As a consequence, emotion triggering when fires occur in the perception radius $visualRadius_i(t)$ of agent $i$ at time $t$ is formalized as follows. When $fearDecay_i(t) = 0$, we define the intensity of the triggered fear by:

$$\Delta fearEnv_i(t) \stackrel{def}{=} \frac{1}{1 + e^{-\lambda_i \left( 1 - \frac{minDistFires_i(t)}{visualRadius_i(t)} \right)}}. \quad (2)$$

Clearly, $\Delta fearEnv_i(t)$ is a sigmoid function, where $\lambda_i$ characterizes the steepness of the curve. $\lambda_i$ should increase with the number of fires in $i$'s perception area at time $t$ (formally, $card(detectedFires_i(t))$), and it also depends on the fire influence coefficient of agent $i$ ($fireInflCoeff_i$). So:

$$\lambda_i \stackrel{def}{=} fireInflCoeff_i \times \left( 1 - \frac{1}{card(detectedFires_i(t)) + 1} \right). \quad (3)$$

Note that $fireInflCoeff_i$ could depend on agent $i$'s knowledge about and experience with fire [23]. We have chosen a sigmoid function here because this type of function illustrates perfectly the switch between a low level of fear intensity (a level under the triggering threshold of fear) and the triggering of fear. We use a particular steepness $\lambda_i$ that can easily be changed, depending on the experimental situation.

Figure 2 illustrates the impact of the number of fires and of their distance on the initial fear level; note that (2) ensures that $\Delta fearEnv_i(t) \in [0, 1]$. In Figure 2, the fear at time $t$ is computed only from the environment influence (neither emotion decay nor neighbors' influence is applied), and it is supposed that as time increases, the number of fires decreases. Several simulations have been executed, corresponding to several minimal distances between agent $i$ and the fires (that is, $minDistFires_i(t) \in \{0.0, 5.0, 10.0, \ldots, 40.0\}$), so the evolution is described by $fearEnv_i(t) = \Delta fearEnv_i(t)$. The numerical values chosen in this section reflect a case study of the size of a supermarket; the other coefficients were chosen so that the results illustrate the equations well (the exploration of the various parameter values is provided in Section 4). Note that the lower $minDistFires_i(t)$ is, the higher the intensity of fear when the number of fires is maximal.

Figure 2: Fire number and distance impact on the emotion level (with $visualRadius_i = 40$ and $fireInflCoeff_i = 1$).

Emotion update (when $fearDecay_i(t) > 0$). When $fearDecay_i(t) > 0$, fear has already been triggered, and we assume that the perception of fires must change this previous fear level. So, we use the derivative (4) of the sigmoid described in (2) to update the emotion level step by step. For convenience's sake, let

$$\lambda'_i \stackrel{def}{=} \lambda_i \times \left( 1 - \frac{minDistFires_i(t)}{visualRadius_i(t)} \right).$$

Then $\Delta fearEnv_i(t)$ is simply the variation of fear following from the environment influence on the emotion level at time $t$ (without taking the emotion decay into account). That is, when $0 < fearDecay_i(t) < 1$:

$$\Delta fearEnv_i(t) \stackrel{def}{=} fearDecay_i(t) \cdot \left( 1 - fearDecay_i(t) \right) \cdot \lambda'_i. \quad (4)$$

Figure 3 presents the evolution of the fear level under the single influence of the environment (fire). The fear evolution is thus described by the equation:

$$fearEnv_i(t) = fearDecay_i(t) + \Delta fearEnv_i(t).$$

Figure 3: Emotional level dynamics influenced only by the environment ($emDecayCoeff_i = 0$) with (for every $i \in AGT$ and $t \in TIME$): $fireInflCoeff_i = 0.1$, $card(detectedFires_i(t)) = 2$, $minDistFires_i(t) = 10$, $visualRadius_i(t) = 40$, and with $fear_i(t_0) = 0.05$.
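A sketch of the environment step, combining the triggering case (2)-(3) and the update case (4) (ours; the handling of the no-fire case is our assumption, since the equations are only defined when fires are perceived):

```python
import math

def fear_env(fear_after_decay, n_fires, min_dist, visual_radius, fire_infl_coeff):
    """Environment step: returns fearEnv_i(t) from fearDecay_i(t)."""
    if n_fires == 0:
        return fear_after_decay                          # no perceived hazard
    lam = fire_infl_coeff * (1 - 1 / (n_fires + 1))      # steepness lambda_i, eq. (3)
    closeness = 1 - min_dist / visual_radius             # 1 - minDistFires / visualRadius
    if fear_after_decay == 0:
        return 1 / (1 + math.exp(-lam * closeness))      # triggering, eq. (2)
    # update via the sigmoid's derivative, eq. (4); keeps the level in [0, 1]
    return fear_after_decay + fear_after_decay * (1 - fear_after_decay) * lam * closeness
```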
We have chosen here a sigmoid function because this type of function illustrates perfectly the switch between a low level of the fear intensity3 and the triggering of fear. We use here a particular steepness λi that must be easily changed, depending of the experimental situation. In Figure 24, fear at time t is computed only from the en- vironment influence (neither emotion decay is applied nor neighbors influence). It is supposed here that the more time increases, the more fires number decreases. Sev- eral simulations have been executed, corresponding to sev- eral minimal distances between agent i and fires (that is: minDistFiresi(t) ∈ {0.0, 5.0, 10.0, · · · , 40.0}). So, its evolution is described by fearEnv i(t) = ∆fearEnv i(t). Note that the more minDistFiresi(t) is low, the more the intensity of fear is high when the number of fires is maxi- mal. Emotion update (when fearDecay i(t) > 0). When fearDecay i(t) > 0, fear has already been triggered and we assume that the perception of fires must change this previ- ous fear level. So, we use the derivative (4) of the previous sigmoid described in (2) to update step by step the emotion level. For convenience’ sake, let be λ′i def = λi × ( 1− minDistFiresi(t) visualRadiusi(t) ) . So, ∆fearEnv i(t) is just the variation of fear following from the environment influence on the emotion level at time t. That is: ∆fearEnv i(t) def = fearDecay i(t).(1− fearDecay i(t)).λ′i (4) when 0 < fearDecay i(t) < 1 3By low level, we means a level that is under the triggering threshold of fear. 4The numerical values chosen in this section have been chosen with a case study of the size of a supermarket in mind. For the other coefficients, they have been chosen in order that results to be good illustration of the equations. The exploration of the various values of parameters is provided in the Section 4 ∆fearEnv i(t) is here the variation of i’s fear level at time t after the influence of the environment on the emotion level (without taking into account the emotion decay). Figure 3 presents the evolution of the fear level under the single influence of the environment (fire).The fear evo- lution is thus described by the equation: fearEnv i(t) = fearDecay i(t) + ∆fearEnv i(t). 2.4 The Neighbors’ Emotional Contagion The two previous subsections focused on the individual part of the emotion. We consider here its social aspect: emotions can spread among neighbors. This has already been investigated in many works, such as [13, 5] where the emotion of an agent tends to the average value of all the agents over time (as in our model). In our model, an agent detects its neighbors at time t based on its visual radius (see Ni(t) in Section 2.1). So, the emotional influence of agent j on agent i at time t is the difference between the emotion level of i and the emo- tion level of j at time t. This influence is weighted by the influence coefficient agtInflCoeff i(j) of j on i. So, for- mally: InfluenceOf j i(t) def = ( fear j(t− 1)− fearEnv i(t) ) × agtInflCoeff i(j) (5) agtInflCoeff i(j) depends on the relationship between i and j: stronger theses relationships are, higher this value is. This equation is based on the bounded confidence model of [18]. Some equations have been proposed in the social network analysis area (see [7, 20, 19, 16, 30] for instance) corresponding to the modelling of different situations. Note that if fear j(t − 1) > fearEnv i(t) then InfluenceOf j i(t) > 0: it means that the fear level of i will increase. 
2.5 The Emotion Level Global Equation

The new emotion level of agent $i$ at time $t$, after the decay due to time (see Section 2.2), the influence of the environment (see Section 2.3), and the influence of $i$'s neighbors (see Section 2.4), is simply:

$$fear_i(t) = fearNeighb_i(t). \quad (7)$$

(This is because we have chosen to compute the fear at time $t$ as a composition of functions.)

2.6 Additional Influences of the Environment on Emotion

Some other factors may impact agents' emotions in different manners. For instance, the influence of smoke is similar to that of fire, but the impact coefficient can be different. The influence of an alarm does not depend on the distance, as we can suppose that all people hear the alarm. Finally, we can also mention as additional factors influencing agents' emotions: the fear reduction due to a security agent, the impact of the perception of an exit door, or the impact of help received from others.

3 Experiments on the emotion dynamics

In this section, we assess the impact of various possible combinations of the three factors (emotion decay, contagion and environment) on the emotion dynamics. We first investigate the emotion dynamics alone and then couple it with a second dynamics: agents' movement. (Note that in the following, $i$'s visual radius does not change over time, and we write it $visualRadius_i$.)

3.1 Emotion Dynamics with Unmoving Agents

The following results are computed with $card(AGT) = 20$ and $card(FIRE) = 10$ and with the following values of the agent parameters (for every agent $i \in AGT$): $emDecayCoeff_i = 0.02$, $fireInflCoeff_i = 0.1$, $agtInflCoeff_i(j) = 0.04$ for every $j \in N_i(t)$ and every $t \in TIME$, and $visualRadius_i = 40$. Neither the agents nor the fires move.
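The experiments below iterate the full composition (7). A minimal synchronous update loop, built from the sketches above (the simultaneous update of all agents from the $t-1$ levels is our reading of the model):

```python
def step(agents, fires):
    """One time step: fear_i(t) = fearNeighb_i(t), the composition of the
    decay, environment and contagion steps (eqs. (1)-(7))."""
    for a in agents:
        a.fear_prev = a.fear                     # freeze fear_i(t-1)
    for a in agents:
        after_decay = fear_decay(a.fear_prev, a.em_decay_coeff)   # Section 2.2
        seen = detected_fires(a, fires)                           # Section 2.3
        a.fear_env = fear_env(after_decay, len(seen),
                              min_dist_fires(a, fires),
                              a.visual_radius, a.fire_infl_coeff)
    for a in agents:
        a.fear = fear_neighbors(a, agents)       # Section 2.4, eq. (7)
```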
3.1.1 Emotional Contagion

In these simulations, we first check the impact of the random distribution of agents in the environment on the contagion. As they have a limited perception radius, agents are not able to diffuse their emotion to all other agents. We initialize the agents' fear levels to random values in $[0, 1]$. The result is presented in Figure 5: the agents' emotions tend towards a limited number of values, each of which corresponds to a spatially clustered set of agents.

Figure 5: Emotion evolution of all the agents under the only effect of emotional contagion.

This convergence towards several stable values is quite common in the related field of social opinion dynamics. In particular, [8] proposed the bounded confidence model, which uses continuous opinion values and an acceptability threshold. When two agents (representing individuals moving in an abstract environment) meet each other, they share their opinions; if these are not too far apart (distance in terms of opinion below a given threshold), the opinions are altered so as to come closer. Depending on the parameters (interaction frequency, initial opinion distribution, or even interaction network topology), various kinds of convergence can appear: either convergence to an intermediate consensus or to one or two extremist opinions. In our case, we recognize basically the same pattern; the acceptability threshold of [8] corresponds for us to the perception radius that limits which agents can interact together.

3.1.2 Coupling Emotion Decay and Contagion

As we do not take into account the process triggering emotions from environment stimuli, we initialize $fear_i(t_0)$ randomly in $[0, 1]$ for every agent $i \in AGT$ and test the influence of the decay and contagion factors together. The result is presented in Figure 6. With no influence of fires, the fear level of each agent $i$ converges (due to the emotional contagion) and tends towards 0 (due to the decay). Nevertheless, we can notice that even without any stimulus, the fear level of some agents starts increasing, due to the contagion dynamics, before finally decreasing when the decay becomes the dynamics with the greatest influence on the system.

Figure 6: Emotion evolution of all agents under both the decay and the contagion effects.

3.1.3 Coupling Emotion Decay and Environment

Let $fear_i(t_0) = 0$ for every $i \in AGT$; the emotion will be triggered by the perception of fires. The result is presented in Figure 7. We first observe that the fear level of some agents stays at or tends towards 0, because they cannot perceive any fire.
Time to reach it depends on the distance to fires and the number of neighbours. Nevertheless we can again observe a stability of the results. In addition, due to emotional contagion over agents, no agent has its fear level staying at the value 0. Even agents that cannot perceive the danger start to feel fear because of their neighbors. Figure 9: Emotion evolution of all the agents under the decay, the environment and the contagion effects. Figure 10: Impact of all the factors (decay, environment, contagion) on the emotion intensity in case of moving agents. 3.1.5 Coupling Emotion Decay, Environment and Emotion Contagion Finally we couple the three processes in a single model. Figure 9 displays the results. The results show again that fear levels tend to a stable value. This value is obviously lower than the value obtained without decay (see Figure 8). But it is interesting to note that the fear level values are also lower than the ones in the case without contagion (see Figure 7). The contagion process indeed drives fear level values to the average value which induces a decrease of the maximum value. 3.2 Emotion Dynamics with Moving Agents The previous results come from simulations with static agents and environment, providing, as expected, stable re- sults. In this section we will introduce agents mobility. We launch the simulations in the same conditions as the pre- vious ones, except that we have 10 agents. Agents move randomly in the environment: they pick a random target in the environment, move to it and when they reached it they choose a new one. Figure 10 displays each agent emotion evolution. We can observe that the results are not stable anymore. Emotional Contagion Model for Group. . . Informatica 41 (2017) 169–182 177 Figure 11: Impact of only the emotion contagion on the emotion intensity in case of moving agents. Indeed as the agents can move they will be sometimes close to fires, increasing their level fear, and sometimes far from them, decreasing their fear level. If we activate only the emotional contagion, we observe in the Figure 11 with moving agents that each agent fear level converges toward the same value. Contrarily to the results in Figure 5, we can observe here a convergence hav- ing moving agents removes the cluster effect that can occur when agents do not move. 4 Sensitivity analysis In this section, we explore the model behavior with respect to parameters variations. We only focus here on the three following coefficients for a given agent i: emDecayCoeff i, fireInflCoeff i and agtInflCoeff i, that characterize the three processes making emotion dynamic during the simu- lation. So, we will measure the maximum, minimum, aver- age and standard deviation values of the agents’ fear level at the end of the simulations. In addition we will com- pare results between two cases: with and without moving agents. We initialize simulations with card(AGT ) = 50, card(FIRE ) = 10, randomly located. For each parame- ters tuple 〈emDecayCoeff i,fireInflCoeff i, agtInflCoeff i〉 (where i ∈ AGT ) we run 10 simulations and measure the maximum, the minimum, the average and standard devia- tion values of the agent fear level at the step number 100. When agents can move, they choose a random target, go to it and when reached the target it picks randomly a new target. Figure 12: Impact of emDecayCoeff i (for every i ∈ AGT ) on the fear level of moving agents in case of fireInflCoeff i = 0.05 and agtInflCoeff i = 0.01. 
4.1 Exploration in the Case of Moving Agents

4.1.1 Exploration of the Impact of the Decay Coefficient emDecayCoeff_i

For every agent $i \in AGT$, let $fireInflCoeff_i = 0.05$, $agtInflCoeff_i = 0.01$ and $emDecayCoeff_i \in \{0.01, 0.02, 0.03, 0.04, 0.06\}$. We measure the four indicators presented above, denoted max, min, mean and standard deviation. The results are shown in Figure 12.

Figure 12: Impact of $emDecayCoeff_i$ (for every $i \in AGT$) on the fear level of moving agents, in the case $fireInflCoeff_i = 0.05$ and $agtInflCoeff_i = 0.01$.

We can observe that when $emDecayCoeff_i$ increases, the fear level tends towards 0. This means that when the decay coefficient is more important, the decay process has more influence on the simulation results.

4.1.2 Exploration of the Impact of all the Parameters

The previous Section 4.1.1 shows the impact on the fear level of varying the $emDecayCoeff_i$ parameter alone. We now launch an exhaustive exploration of the model with (for every agent $i \in AGT$):

– $emDecayCoeff_i \in \{0.01, 0.02, 0.03, 0.04, 0.06\}$

– $fireInflCoeff_i \in \{0.05, 0.1, 0.2, 0.3, 0.5\}$

– $agtInflCoeff_i \in \{0.01, 0.06, 0.1, 0.2, 0.3\}$

For each parameter tuple $\langle emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i \rangle$, we launched 10 simulations and stored the average value of each indicator. The complete results are summarized in Figure 13 and Figure 14. These figures display the scatter plots of all possible pairs of parameters and indicators (plotted using the R software: https://www.r-project.org/). For example, in Figure 13, the upper-right frame plots the max indicator in relation to the $emDecayCoeff_i$ parameter. All the bullets correspond to the projection of the tuples $\langle emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i, max \rangle$ (for every $i \in AGT$) onto a two-dimensional plane. This representation allows the modeler to isolate the influence of a single parameter on a single indicator. In addition, still looking at the upper-right frame, we can read the possible values of the $emDecayCoeff_i$ parameter on the right and the value range of the max indicator on the top.

Figure 13: For every agent $i \in AGT$, the max indicator depending on the $emDecayCoeff_i$, $fireInflCoeff_i$ and $agtInflCoeff_i$ values.

Figure 14: The max, min, mean and standard deviation values depending on the $emDecayCoeff_i$, $fireInflCoeff_i$ and $agtInflCoeff_i$ values, for every agent $i \in AGT$.

We can observe that $fireInflCoeff_i$ has a huge influence on the max indicator: when $fireInflCoeff_i$ is high (0.5), the maximum fear levels are also very high (between 0.7 and 1), and this result is independent of the other parameter values. When $fireInflCoeff_i$ is low, the maximum is lower and close to 0.

Similarly, we can observe that the $emDecayCoeff_i$ parameter has an effect on the boundaries of the max indicator: for every $i \in AGT$, when $emDecayCoeff_i$ is high, the maximum of the max indicator is limited to 0.8, whereas with the lowest value of this coefficient the limit is around 1, and many plots are concentrated around this value. We can notice that for intermediate values of the $emDecayCoeff_i$ coefficient, the plots are concentrated around 0.0 and 0.8. We thus have a polarization of the results around two main values, corresponding to the minimum and maximum values that the max indicator can take.
The complete results are summarized in Figure 13 and Figure 14. These figures display the scatter plots of all possible pairs of parameters and indicators. For example, in Figure 13 the upper-right frame plots the max indicator with respect to the emDecayCoeff_i parameter (plotted using the R software: https://www.r-project.org/). All the bullets correspond to the projection of the tuples ⟨emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i, max⟩ (for every i ∈ AGT) onto a 2-dimensional plane. This representation allows the modeler to isolate the influence of a single parameter on a single indicator. In addition, still looking at the upper-right frame, we can read the possible values of the emDecayCoeff_i parameter on the right and the value range of the max indicator on the top.

Figure 13: For every agent i ∈ AGT, max indicator depending on the emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i values.

Figure 14: max, min, mean and standard deviation values depending on the emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i values for every agent i ∈ AGT.

Figure 15: Impact of emDecayCoeff_i on the fear level of unmoving agents when fireInflCoeff_i = 0.05 and agtInflCoeff_i = 0.01 (for every agent i ∈ AGT).

We can thus observe that (for every i ∈ AGT) fireInflCoeff_i has a huge influence on the max indicator: when fireInflCoeff_i is high (0.5), the maximum fear levels are also very high (between 0.7 and 1), and this result is independent of the other parameter values. When fireInflCoeff_i is low (0.05), the maximum is lower and close to 0. Similarly, we can observe that the emDecayCoeff_i parameters affect the boundaries of the max indicator: for every i ∈ AGT, when emDecayCoeff_i is high, the maximum of the max indicator is limited to 0.8, whereas with the lowest value of this coefficient the limit is around 1, and many points are concentrated around this value. We can notice that for intermediate values of the emDecayCoeff_i coefficient, points are concentrated around 0.0 and 0.8. We thus have a polarization of the results around two main values, corresponding to the minimum and maximum values that the max indicator can take.

We can also observe that agtInflCoeff_i does not have a visible impact on the max indicator: with high or low values of this coefficient, the max indicator takes values everywhere in [0, 1]. Looking at Figure 14, we can also notice that fireInflCoeff_i has a smaller influence on the min indicator, whereas emDecayCoeff_i has a larger one. In particular, when emDecayCoeff_i increases, the min indicator takes lower values. It is also interesting to notice that, when we consider emDecayCoeff_i, the distributions of the min and mean plots are very close, whereas when we consider fireInflCoeff_i, the max and mean plot distributions are close (and different from the min distribution). This means that, on average, the mean distribution is closer to the min (resp. max) plot distribution. Finally, we can observe that, even though agtInflCoeff_i does not have a significant influence on the max and mean indicators, it tends to reduce the standard deviation. This means that emotional contagion tends to level the fear values across agents.

4.2 Exploration in the Case of Unmoving Agents

We run simulations with the same initial conditions as in the previous section, except that agents do not move. The results are quite similar to those obtained with moving agents (Figure 15). This is due to the high number of agents and the chosen visual radius (visualRadius_i = 40 for every i ∈ AGT). We extend this experiment by varying emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i(j) (for every agent i and every j ∈ N_i(t)). The comparison is presented in Figures 16(a), 16(b), 16(c) and 16(d). We can observe that there is only a small difference in the emotion level values between the two cases: the agents' emotions seem not to depend on whether agents move or not. This can be explained by the large value of the visual radius: an agent can detect more agents, so it is influenced by more of them. A moving agent evidently has more opportunities to meet other agents, but with a large visual radius there is little difference between the two types of agents. Another important point is that the influence of a neighbour does not depend on its distance, so the distances between agents, as they move, do not play an important role.

Nevertheless, we go a little deeper in the comparison between simulations with moving and unmoving agents. We aim at evaluating the time needed for fear levels to converge under the influence of the emotional contagion process alone, and the influence of agtInflCoeff_i(j) (for every j ∈ N_i(t)) on this convergence. We run simulations and stop them when the standard deviation indicator becomes lower than 0.01, counting the number of simulation steps necessary to reach this state. The results are shown in Figure 17. We can observe that the number of steps needed to reach the equilibrium is higher for unmoving agents than for moving ones: moving agents tend to meet more other agents and this mixing speeds up the emotion convergence. The mixing has a huge impact when agtInflCoeff_i(j) (for every j ∈ N_i(t)) is low, but its impact decreases as the parameter value increases.
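The stopping rule just described is easy to state in code. The sketch below counts the steps until the dispersion of fear levels falls under the 0.01 threshold; the step function used in the demo is a hypothetical pure-contagion update over a fully connected group, so the numbers are illustrative only.

```python
import statistics

def steps_to_convergence(fears, step, tol=0.01, max_steps=100_000):
    """Run `step` until the standard deviation of the fear levels drops
    below `tol`, and return the number of steps that were needed."""
    for t in range(max_steps):
        if statistics.pstdev(fears) < tol:
            return t
        fears = step(fears)
    return max_steps

def contagion_step(fears, agt_infl=0.06):
    # Pure contagion toward the group mean (fully connected neighbourhood).
    mean = sum(fears) / len(fears)
    return [f + agt_infl * (mean - f) for f in fears]

print(steps_to_convergence([0.0, 0.3, 0.9, 0.5], contagion_step))
```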
(a) Unmoving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.01 for every j ∈ N_i(t). (b) Unmoving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.08 for every j ∈ N_i(t). (c) Moving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.01 for every j ∈ N_i(t). (d) Moving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.08 for every j ∈ N_i(t).

Figure 16: Comparison of moving and unmoving agents when varying the three factors emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i(j).

Figure 17: Relationship between agtInflCoeff_i(j) (for every j ∈ N_i(t)) and the time needed for all the agents to reach an equivalent emotion level.

5 Conclusion and future works

In this article we proposed a model of fear level dynamics based on some main findings from social psychology. Our aim is to provide an intuitive formalization of the computational process for emotion modeling. The model was implemented in the GAMA agent-based simulation platform. We conducted intensive experiments to assess the three coefficients that impact the emotion intensity of an agent group, and presented our results about the impact of the decay, environment, and neighbouring-agent factors (i.e. emotional contagion) on emotion intensity.

We showed, using several scenarios, how emotion evolves over time and the role played by each variable of the simulation. In particular, the environment (here, the perception of fires) has a great influence on the maximum fear level, whereas the emotional contagion tends to bring emotions closer together in the agent population. Although the context of this paper is crisis situations and evacuation, the study remains abstract: its purpose is mainly to focus on the emotion dynamics model and its exploration.

The next step will be to integrate this emotional framework into a simulation of evacuation in crisis situations. Emotions will be used at several levels: the physical properties of agents (strong emotions can make people move faster or slower), the decision-making process (it is now established that emotions help to make decisions and often speed up the decision-making process, at the risk of making less efficient decisions), and social processes (in particular, group constitution and the effects of the group on its members). The main objective will be to provide more realistic evacuation simulations in terms of human behaviors, and thus to build decision-support systems for crisis managers. We thus attempt to make simulations more realistic by improving the behaviors of the human agents (in line with [1, 6]).

More particularly, two application cases can be very interesting. First, this work could help architects and urban planners to better design public spaces so that people can evacuate more easily, taking into account cognitive attitudes such as emotions or social bonds and not only simple physical flows of individuals. Second, we plan to apply this framework to the case study of Australian bushfire simulations [2]. These bushfires have killed hundreds of people and have been deeply studied, in particular through interviews of most of the survivors. An important conclusion of this survey was that civilians did not react and act as expected by the authorities in charge of fire preparedness and victim rescue. First models of the evacuation have been implemented, with a focus on the distinction between objective and subjective civilian capabilities and perception of the environment. We argue that they could be improved by introducing emotional capabilities that influence these biases in the representation of the world.
6 Acknowledgments

This work is funded by research project number QG.15.31 at Vietnam National University, Hanoi, on the modeling and simulation of fire evacuation in public buildings. We also thank the referees of SoICT'2016 for their constructive remarks.

References

[1] C. Adam and B. Gaudou. BDI agents in social simulations: a survey. Knowledge Engineering Review (KER), 31(3):207–238, 2016.
[2] C. Adam and B. Gaudou. Modelling human behaviours in disasters from interviews: application to Melbourne bushfires. Journal of Artificial Societies and Social Simulation (JASSS), (to appear), 2017.
[3] J. R. Anderson and C. Lebiere. The Atomic Components of Thought. Lawrence Erlbaum Associates, Mahwah, NJ, 1998.
[4] T. Bosse, R. Duell, Z. A. Memon, J. Treur, and C. N. van der Wal. Multi-agent model for mutual absorption of emotions. ECMS, 2009:212–218, 2009.
[5] T. Bosse, R. Duell, Z. A. Memon, J. Treur, and C. N. van der Wal. Agent-based modeling of emotion contagion in groups. Cognitive Computation, 7(1):111–136, 2015.
[6] P. Caillou, B. Gaudou, A. Grignard, Q. C. Truong, and P. Taillandier. A simple-to-use BDI architecture for agent-based modeling and simulation. ESSA, pages 15–28, 2015.
[7] M. de Groot. Reaching a consensus. Journal of the American Statistical Association, 69(345):118–121, 1974.
[8] G. Deffuant, D. Neau, F. Amblard, and G. Weisbuch. Mixing beliefs among interacting agents. Advances in Complex Systems, 03(01–04):87–98, 2000.
[9] J. Drury and C. Cocking. The mass psychology of disasters and emergency evacuations: A research report and implications for practice. Research report, University of Sussex, 2007.
[10] J. Drury, C. Cocking, and S. Reicher. Everyone for themselves? A comparative study of crowd solidarity among emergency survivors. British Journal of Social Psychology, 48:487–506, 2009.
[11] J. Drury, C. Cocking, and S. Reicher. The nature of collective resilience: Survivor reactions to the 2005 London bombings. International Journal of Mass Emergencies and Disasters, 27(1):66–95, 2009.
[12] J. Elster. Alchemies of the Mind: Rationality and the Emotions. Cambridge University Press, 1999.
[13] L. Fu, W. Song, W. Lv, and S. Lo. Simulation of emotional contagion using modified SIR model: A cellular automaton approach. Physica A: Statistical Mechanics and its Applications, 405:380–391, 2014.
[14] P. Gantt and R. Gantt. Disaster psychology: dispelling the myths of panic. Emergency Planning, 2012.
[15] U. Grandi, E. Lorini, A. Novaro, and L. Perrussel. Strategic disclosure of opinions on a social network. In Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2017), 2017.
[16] M. Granovetter. Threshold models of collective behavior. American Journal of Sociology, 83(6):1420–1443, 1978.
[17] A. Grignard, P. Taillandier, B. Gaudou, D. A. Vo, N. Q. Huynh, and A. Drogoul. GAMA 1.6: Advancing the art of complex agent-based modeling and simulation. In PRIMA 2013: Principles and Practice of MAS, pages 117–131. Springer, 2013.
[18] R. Hegselmann and U. Krause. Opinion dynamics and bounded confidence models, analysis, and simulations. Journal of Artificial Societies and Social Simulation, 5(3), 2002.
[19] D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.
[20] D. Kempe, J. Kleinberg, and E. Tardos. Influential nodes in a diffusion model for social networks. In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP-2005), 2005.
[21] R. S. Lazarus. Emotion and Adaptation. Oxford University Press, 1991.
[22] V. M. Le, C. Adam, R. Canal, B. Gaudou, T. V. Ho, and P. Taillandier. Simulation of the emotion dynamics in a group of agents in an evacuation situation. In N. Desai, A. Liu, and M. Winikoff, editors, Principles and Practice of MAS, volume 7057 of LNCS, pages 604–619. Springer, 2012.
[23] J. Leach. Why people 'freeze' in an emergency: Temporal and cognitive constraints on survival responses. Aviation, Space, and Environmental Medicine, 75(6):539–542, 2004.
[24] M. Hoogendoorn, J. Treur, C. N. van der Wal, and A. van Wissen. Modelling the interplay of emotions, beliefs and intentions within collective decision making based on insights from social neuroscience. In International Conference on Neural Information Processing, pages 196–206. Springer, 2010.
[25] V. T. Nguyen, D. Longin, T. V. Ho, and B. Gaudou. Integration of emotion in evacuation simulation. In C. Hanachi, F. Bénaben, and F. Charoy, editors, Information Systems for Crisis Response and Management in Mediterranean Countries, volume 196 of Lecture Notes in Business Information Processing, pages 192–205. Springer, 2014.
[26] N. R. Johnson. Panic and the breakdown of social order: Popular myth, social theory, empirical evidence. University of Cincinnati, pages 171–183, 1987.
[27] A. Ortony, G. Clore, and A. Collins. The Cognitive Structure of Emotions. Cambridge University Press, Cambridge, MA, 1988.
[28] E. Quarantelli. The sociology of panic. In Smelser and Baltes, editors, International Encyclopedia of the Social and Behavioural Sciences, pages 11020–11023. Pergamon Press, New York, 2001.
[29] E. Quarantelli. The nature and condition of panic. American Journal of Sociology, pages 267–275, 2010.
[30] T. Schelling. Micromotives and Macrobehavior. Norton, 1978.
[31] K. R. Scherer, A. Schorr, and T. Johnstone, editors. Appraisal Processes in Emotion: Theory, Methods, Research. Oxford University Press, 2001.
[32] X. H. Ta, D. Longin, B. Gaudou, and T. V. Ho. Impact of group on the evacuation process: theory and simulation. In Proceedings of the Sixth International Symposium on Information and Communication Technology, pages 350–357. ACM, 2015.
[33] P. Taillandier, A. Grignard, B. Gaudou, and A. Drogoul. Des données géographiques à la simulation à base d'agents : application de la plate-forme GAMA. European Journal of Geography, 671:online, 2014.

Informatica 41 (2017) 183–192

Key-Value-Links: A New Data Model for Developing Efficient RDMA-Based In-Memory Stores

Hai Duc Nguyen, The De Vu, Duc Hieu Nguyen, Minh Duc Le, Tien Hai Ho and Tran Vu Pham
Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Vietnam
E-mail: ptvu@hcmut.edu.vn

Keywords: in-memory stores, key-value, RDMA, InfiniBand

Received: March 24, 2017

This paper proposes a new data model, named Key-Value-Links (KVL), to help in-memory stores utilize RDMA efficiently. The KVL data model is essentially a key-value model with several extensions. It organizes data as a network of items in which items are connected to each other through links. Each link is a pointer to the address of the linked item and is embedded into the item establishing the link.
Organizing datasets using the KVL model enables applications to use RDMA Reads to fetch items directly from the server at very high speed. Since link chasing bypasses the CPU on the server side, this operation allows the client to read items at extremely low latency and greatly reduces the workload at data nodes. Furthermore, our model fits many real-life applications well, ranging from graph exploration and map matching to dynamic web page creation. We also developed an in-memory store utilizing the KVL model, named KELI. The results of experiments on real-life workloads indicate that KELI, without much optimization, easily outperforms Memcached, a popular in-memory key-value store, in many cases.

Povzetek: Predlagan je nov podatkovni model, imenovan Key-Value-Links (povezave ključnih vrednosti).

1 Introduction

In-memory stores have flourished in recent years owing to the urgent need for fast processing and decreasing DRAM prices. Many system designers have either used main memory as a primary data store [17] or as a cache to reduce the latency of accessing hot or latency-sensitive items [2]. Moving data to main memory enables it to be accessed at very low latency because the overhead of disk and flash is removed. This does not mean, however, that I/O overhead is completely eliminated. Because of DRAM's low capacity, in-memory stores are often deployed across multiple data nodes, making network I/O a potential source of overhead. Indeed, traditional TCP/IP networks have shown many disadvantages in supporting fast data transmission. For example, MemC3, a state-of-the-art in-memory store, runs seven times faster on a single machine than in a client-server setup using TCP/IP [9, 8].

To solve this problem, several data centers have started looking for alternative solutions. Among these, Remote Direct Memory Access (RDMA) appears to be the most promising candidate. RDMA allows applications to directly read from and write to remote memory without involving the operating system at either host. This ability helps RDMA achieve low-latency and high-throughput data transmission because it bypasses the overhead of complex protocol stacks, avoids buffer copying, and reduces CPU overhead. Despite these attractive features, RDMA has not been widely used in data centers due to the high prices of its supporting NICs. In recent years, however, the prices of RDMA-enabled NICs have dropped dramatically and become comparable with those of traditional Ethernet NICs. For example, a 40 Gbps InfiniBand RDMA-capable NIC costs around $500, while the price of a 10 Gbps Ethernet NIC may be up to $800 [15]. New standards such as iWARP and RoCE also support RDMA, allowing data centers to utilize RDMA at reasonable cost.

Within this trend, many studies have started to leverage RDMA technology to build ultra-low latency in-memory stores. Those works indicate that much effort has to be spent in order to maximize the benefits of using this technology for in-memory systems. The work to be done includes reducing the NIC's cache miss rate [8], minimizing the number of RDMA operations per request [15, 8], and optimizing the hash table organization [8, 15, 12]. In spite of implementation differences, most existing RDMA-based in-memory stores are constructed according to the key-value model, since this model is very simple and fits large, unstructured datasets well. The key-value model, however, has its own drawbacks. The most noticeable one is performance.
Traditionally, every put and get operation involves a hash table lookup to determine whether the item exists. This makes the hash table the hotspot of data access, and it is not surprising that most key-value stores spend much effort tuning their hashing mechanisms [15, 8, 9]. Real-life workloads indicate that key-value items are typically small [3], so employing the key-value model can easily lead to low network utilization. Furthermore, the key-value model often forces applications to divide each request into multiple small item lookups. Those lookups often have to be executed sequentially due to data dependencies. As a result, the hash table lookup overhead and the low bandwidth utilization of the individual lookups accumulate and considerably prolong the latency of the original request. According to [17], Facebook creates about 130 internal requests on average to generate the HTML for a page. Similarly, Amazon requires about 100–200 requests to create the HTML part of each page [7]. With such workloads, in-memory stores have to react very quickly to each request to guarantee the desired performance. In the future, as the amount of data and the workload keep increasing rapidly, it will be difficult for in-memory stores to maintain their performance without changing their request processing mechanisms.

In this paper, we introduce a novel data model named Key-Value-Links (KVL) to enable in-memory stores to exploit RDMA efficiently and deliver ultra-low latency data services. Essentially, the KVL model is a variant of the key-value model that maintains links between items, exploiting the data dependencies between them to accelerate data retrieval. A link contains information about the (physical) location and the size of the referred item, so applications can use RDMA Reads to directly fetch the item without invoking expensive item lookups. This design bypasses the hash table and uses the network efficiently, as only one RDMA Read is needed to read an item. As a result, getting desired items by chasing their links significantly reduces the cumulative latency of processing multiple item requests. We also introduce KELI (KEy-value-with-Links In-memory), an in-memory cache employing the KVL data model, and compare it with an in-memory key-value cache (Memcached) to reveal the performance benefits of utilizing the KVL model over RDMA-capable networks.

The following section briefly introduces RDMA technologies and recent work on developing in-memory stores using RDMA. Section 3 discusses the KVL model in detail. Several classes of applications which can utilize the model efficiently are listed in Section 4. Section 5 discusses the design of KELI. We conduct several experiments on real-life data to evaluate the efficiency of KELI and report their results in Section 6. Finally, we conclude the paper in Section 7.

2 Background and related work

2.1 Remote direct memory access

Remote Direct Memory Access (RDMA) allows a remote computer to directly read memory regions in a host's local memory without involving the host's CPU. This enables zero-copy data transfers and saves computing resources. Furthermore, RDMA-enabled NICs provide kernel bypass for all communications and reliable delivery to applications. As a result, the typical latency of interconnects supporting RDMA, such as InfiniBand, RoCE and iWARP, is about 10× lower than that of traditional Ethernet [12].
RDMA-enabled NICs were originally designed for high-performance computing centers, but due to decreasing hardware prices their presence in data centers is increasing [15]. The introduction of RoCE and iWARP, which let RDMA be performed over traditional network architectures, has made RDMA even more popular.

Applications use RDMA-enabled NICs through the Verbs API. There are several types of verbs, but the most common are RDMA Read, RDMA Write, Send and Receive. These verbs can be grouped into two types of semantics: channel semantics and memory semantics. Send and Receive have channel semantics: to send a message, the sender posts a Send descriptor that puts the message content into a remote memory location specified by a pre-posted Receive descriptor at the receiver side. Send and Receive are two-sided verbs, as the communication involves the CPUs of both endpoints. RDMA Read and RDMA Write have memory semantics: they operate directly upon remote memory regions. Both of them are one-sided, as the remote CPU is not aware of these operations. This reduces not only the overhead of RDMA operations but also the load on the remote CPU. Therefore, one-sided RDMA verbs can achieve very low latency and high throughput.

2.2 In-memory stores using RDMA

The attractive features of RDMA verbs have motivated many studies on utilizing RDMA technology to build high-performance in-memory stores. As communication is the major source of overhead, previous work tries to replace traditional data transfer techniques with RDMA operations to effectively reduce overall latency. For example, Jose et al. [11] improve Memcached performance by a factor of four just by making it RDMA capable. In later work [10], the same research group uses a hybrid approach that utilizes both Reliable Connection (RC) and Unreliable Connection (UC) transports and transparently switches between them to further improve performance, by a factor of 12.

Apart from communication, recent studies have started to apply multiple optimization techniques to other parts of the system to reach even better performance. HERD [12, 13] makes heavy changes, ranging from reducing network round trips and reorganizing data distribution to optimizing PCIe transactions. It even sacrifices reliability to maximize performance. RDMA has also been combined with other technologies to develop more complex in-memory stores: DrTM [23] and DrTM+R [6] are two fast in-memory transaction systems utilizing both RDMA and hardware transactional memory (HTM).

Different from traditional designs, Pilaf [15], FaRM [8] and HydraDB [22] let applications process requests by themselves through RDMA Reads. In these systems, the server makes data visible to clients so that a client can use RDMA Reads to access the hash table and items at the remote server as if they were in its own local memory. This approach bypasses many sources of overhead and reduces the load at data servers, but there are shortcomings preventing those systems from maximizing the potential of RDMA Reads. For example, Pilaf clients have to carry out multiple RDMA Reads per request. FaRM often performs RDMA Reads on memory blocks which are much larger than the actual size of the needed item. Also, bypassing the remote CPU makes the server unaware of application behaviors, which are useful for tracking popular items. Despite differences in implementation, all studies mentioned above employ the key-value model.
Although this model is quite simple and easy to implement, its inability to represent complex data forces applications to generate many item lookups for each data request. This disadvantage makes the key-value model sensitive to latency, and it is our motivation for developing the Key-Value-Links model.

3 Key-Value-Link data model

3.1 Example

Before describing the Key-Value-Links (KVL) model in detail, let us first show how it "looks and feels" through an example of using it to represent a real-life dataset and handle data requests. Suppose we have a database storing information about students, professors, and departments in a university. Figure 1 illustrates how KVL is used to organize this database. In this representation, each entity (i.e. student, professor, and department) is a key-value item. The key is unique and is used to identify the item. The database has five entities: two students "stu001" and "stu002" under the supervision of two professors "prof001" and "prof002" working in the department "depcs". The value of each of those items contains multiple attributes representing information associated with the item. The item "stu002", for instance, has three attributes in its value. The first one ("name: DEF") gives the student's name, while the others are links indicating his supervisor and mentor. These links do not provide information about those people but instead point to the items storing information about them. In an implementation, links can be represented as pointers, which let applications directly access the linked item without sending a request to the data server.

Figure 1: An example of representing a dataset from a university using the KVL data model. The key of each item is shown in bold text, the links are shown in shadowed frames, and the arrows represent the links from one item to another.

Storing links to other items inside an item's value makes it easier to derive useful information that requires combining data from multiple sources. In the university database, for example, suppose a user wants to know whether the two students "stu001" and "stu002" are supervised by professors working in the same department. To answer this question, we must first access the two student items using their keys. After that, we traverse the supervisor links of those items to obtain information about the supervisors. We then use the department links to go to their departments. Finally, we check whether those departments are the same to provide the final answer to the question.

If this database were represented in the relational model, answering this question would require multiple cumbersome joins. Because such operations consume a lot of time and resources, using a relational database in this case does not guarantee acceptable performance. If we use the traditional key-value model, the application must decompose the request into multiple item lookups. Since most of the lookups depend on the results of the previous ones, the application has to perform them sequentially. If the number of lookups is large, the cumulative latency, obtained by adding up the latency of each item lookup, becomes very high and considerably hurts the overall latency of the original request.
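The sketch below restates the university example in Python. A plain dict stands in for the server's memory region, integer constants stand in for physical addresses, and chase() stands in for an RDMA Read at a known address; all names are illustrative, not part of any real API. In the actual model, the two student items would first be fetched by key (one lookup each), and every step after that is a link chase.

```python
memory = {}                        # "address" -> item, simulating server DRAM

def store(addr, item):
    memory[addr] = item
    return addr                    # in KVL, a link records the item's address

def chase(link):
    # Stand-in for an RDMA Read: fetch the item at a known address
    # without any hash-table lookup on the server.
    return memory[link]

dep = store(0x100, {"name": "XYZ"})                                  # depcs
p1  = store(0x200, {"name": "UVT", "department": dep})               # prof001
p2  = store(0x300, {"name": "IJK", "department": dep})               # prof002
s1  = store(0x400, {"name": "ABY", "supervisor": p1})                # stu001
s2  = store(0x500, {"name": "DEF", "supervisor": p2, "mentor": p1})  # stu002

def same_department(stu_a, stu_b):
    """Answer the query from the text purely by chasing links."""
    dep_a = chase(chase(stu_a)["supervisor"])["department"]
    dep_b = chase(chase(stu_b)["supervisor"])["department"]
    return dep_a == dep_b

print(same_department(s1, s2))     # True: both professors are in depcs
```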
Most of the item lookups can be replaced by a single link chase if we use the KVL model. As we will show in the next subsection, an item lookup is much more expensive than a link chase, so using the key-value model also takes more time to process the request than applying the KVL model.

3.2 Data model

Generally, KVL is an enhanced version of the traditional key-value model. In this model, each item is a key-value pair, and items are connected to each other through links. Inside an item, the key is its identifier while the value describes its characteristics. There is no restriction on the size of either key or value. Unlike some implementations of the key-value model used in RAMCloud [17] or Memcached [2], the KVL model cares about the structure of the value. Particularly, the value is a set of attributes in ⟨K, V⟩ format, where K is the name of the attribute and V is its value. The value V can be either a block of bytes representing some kind of information defined by the user or a link to another item.

The concept of links is similar to that of pointers: both let applications know the location of a resource but do not provide information about its content and associated data. There are several benefits to this approach. Embedding an item into another one enlarges the data size significantly, which reduces memory utilization and slows down data transmission; it also allows an item to have multiple copies, which can be a nightmare for maintaining consistency. Furthermore, with the support of RDMA Reads, pointing to the referred item through its address lets applications directly fetch the item without involving the data server. Utilizing RDMA Reads helps fetch items at ultra-low latency, as it bypasses many sources of overhead such as notifying the remote CPU and the hash table lookup. It also allows the system to scale easily, as the remote machine saves many CPU cycles for other tasks.

Figure 2: An example implementation of the KVL model based on the organization of existing key-value stores.

Figure 2 illustrates the organization of an in-memory store implementing the KVL model based on the fundamental structure of existing in-memory key-value stores. Basically, the KVL model is still a key-value model, so the methods it uses to handle data are similar to those of key-value stores. In particular, the store constructs a hash table to keep track of the items it holds. Putting a new item into the store, or getting an existing one from it, first requires the item's key to be hashed into the hash table to determine the proper action. Clearly, all operations in a key-value store involve the hash table, making it the hotspot of the system.

The introduction of links leads to a new way to get data from in-memory stores, called link chasing, which reduces the load on the hash table. In this method, applications use the links attached to previously fetched items to invoke RDMA Reads that directly retrieve the linked items from the in-memory store, without explicitly sending a get request. For example, in Figure 2, the application has performed two lookups to load items B and C from the server. It then has two options to load item A. It can generate a get request containing the key of item A and send it to the server to have the server search for this item.
The other option is to use an RDMA Read to chase the link to item A that is embedded in item C, reading the item directly without asking the server.

Figure 3: The latency of getting items of different sizes using RDMA Read and HERD.

Figure 3 compares the latency of link chasing using RDMA Read with that of an item lookup for different item sizes. The implementation of the item lookup is based on the method used in HERD [12], one of the fastest in-memory key-value stores in the literature. It is clear that, even though it is heavily optimized by many techniques, an item lookup still runs much slower than an RDMA Read. This means that if we can organize the items needed by applications so that from one item we can reach the others just by chasing links, the latency can be reduced by up to 50%. Therefore, using the KVL data model with a good data schema design can significantly boost system performance without spending much effort on optimizing the in-memory store implementation.

4 Applications

Apart from the simple university example in the previous section, we found that KVL is applicable to a wide range of applications. The following are a few of them.

4.1 Graph exploration

Graph exploration is required by many data-intensive applications [16]. Graph traversal algorithms such as breadth-first search (BFS) and depth-first search (DFS) are used as basic components in various sophisticated algorithms that solve problems in many fields, including biology, communication, social networks, etc. In the Big Data era, a graph can contain up to trillions of nodes and edges; hence, traversing such large-scale graphs efficiently is critical.

A graph contains only nodes and edges, but in real-world applications both nodes and edges are associated with a lot of information. This makes representing the topology of a graph a nontrivial task, especially in the case of large graphs, which can span multiple data nodes. Modeling graphs using the relational model or XML does not scale well, and those tools do not ease graph traversals. Many state-of-the-art in-memory graph databases are constructed upon the key-value model [4, 21] due to its simplicity. However, deploying graph traversal algorithms on this model leads to high cumulative latency.

The KVL model, on the other hand, is very similar to the concept of a graph database, since the model itself is a network of items. In fact, it can be considered a "lightweight" graph in which items are vertices and links are edges. We use the term "lightweight" because some limitations prevent KVL from naturally representing complicated graphs. For example, information cannot be embedded into links, and a link must point to a physical address rather than an abstract object. Due to those shortcomings, using links to represent the edges of complex graphs can increase management costs considerably. In spite of this, KVL appears well suited to graph traversal algorithms: with link chasing, applications avoid a lot of overhead while visiting vertices, as the sketch below illustrates.
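Here is a minimal sketch of a BFS over a KVL-style adjacency structure like the Twitter layout used later in Section 6, where follower lists are broken into chunks connected by a "next" link. The memory/chase stand-ins for RDMA Reads follow the conventions of the earlier university sketch; all identifiers and addresses are illustrative.

```python
from collections import deque

memory = {
    0x10: {"name": "userABC", "follow_list": 0x11},
    0x11: {"follows": [0x20], "next": 0x12},      # first chunk of the edge list
    0x12: {"follows": [0x30], "next": None},      # second (last) chunk
    0x20: {"name": "userEDF", "follow_list": None},
    0x30: {"name": "userGHI", "follow_list": None},
}

chase = memory.__getitem__        # stand-in for an RDMA Read at an address

def bfs(start_addr):
    """Visit every reachable user; only the start needs a key lookup,
    every subsequent access is a link chase."""
    seen, todo = {start_addr}, deque([start_addr])
    while todo:
        user = chase(todo.popleft())
        chunk_addr = user["follow_list"]
        while chunk_addr is not None:             # walk the chunked edge list
            chunk = chase(chunk_addr)
            for friend in chunk["follows"]:
                if friend not in seen:
                    seen.add(friend)
                    todo.append(friend)
            chunk_addr = chunk["next"]
    return seen

print(len(bfs(0x10)))   # 3 users reachable from userABC
```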
4.2 Dynamic web content creation

The rapid growth of the amount of data and the need to improve user experience make the number of dynamic web pages increase at a high pace. One well-known solution for efficiently delivering dynamic content is to decompose pages into small fragments and cache those in main memory. Additionally, an object dependence graph (ODG) is constructed to keep track of changes and maintain consistency [5, 19]. When a web page is requested, the ODG is checked in order to rebuild fragments whose content has changed and to fetch those whose content is unchanged directly from the cache. During this process, fragments are fetched sequentially, since later fragments depend on earlier ones. With such an access pattern, using the key-value model implemented in popular in-memory caches to store the fragments and the ODG can lead to high cumulative overhead when creating a page.

The KVL model is well suited to caching such datasets, since they are also graphs in nature. By representing fragments as items and using links to express the dependencies between fragments, applications can construct a web page by simply chasing links between page components. Since link chasing is much faster than item lookup, applying this model reduces the cumulative latency significantly.

4.3 Intelligent transportation systems

Intelligent Transportation Systems (ITS) play an important role in addressing critical issues in urban areas such as congestion, air pollution, and transit safety. The major challenge for ITS systems is that they have to manage a huge amount of data pushed to the system continuously from many sources (GPS, video streams, etc.) in order to produce meaningful information in real time. To do so, the digital map must be well organized, since most critical operations, such as map matching, routing, and congestion detection, rely on it. As the map can be considered a network of points (e.g. intersections) and lines (e.g. streets), the KVL model is a promising candidate for representing its content in ITS systems.

5 KELI: A KVL In-memory Store

We have implemented an in-memory store utilizing the KVL model, named KELI (KEy-value-with-Links In-memory store). We originally developed KELI while constructing a traffic condition monitoring system for Ho Chi Minh City (available at traffic.hcmut.edu.vn). The main role of KELI is to manage the metadata of the city map so that applications can quickly process GPS signals generated by vehicles to produce meaningful information about the current traffic conditions in the city [14]. Although KELI was originally designed for an ITS system, its architecture is general enough to work with other applications, so we extended its implementation to make it applicable to a wide range of use cases.

5.1 System architecture

Our objective when designing KELI is to provide a lightweight in-memory store for rarely-changing datasets stored in complex (disk-based) databases. Particularly, KELI copies items stored in the database to memory and lets applications access them through its interface instead of sending requests directly to the database. The design of KELI also assumes that updates occur very rarely and that a few changes do not seriously impact application performance and correctness.

Figure 2 illustrates the overall architecture of KELI. Data is stored permanently on disk to ensure durability and availability, while KELI is deployed entirely in memory. After starting up, KELI accesses the data on disk and loads it into memory. During this process, items are transformed from their original on-disk format into the KVL format.
After KELI finishes loading data from disk, data accesses can be redirected to KELI, and the database then acts as a backup module. KELI does not support update operations (i.e. modify, write, and delete), so if an application wants to change the content of the data, it still has to send those requests to the database. Updates occurring on disk do not take effect immediately in the in-memory store; KELI instead reloads the content of the data on disk at predefined, fixed intervals.

5.2 Data layout

Since DRAM capacity is much smaller than that of secondary storage, utilizing memory space efficiently is a crucial requirement. To do so, avoiding or reducing fragmentation is necessary, as fragmentation is the primary source of low memory utilization. S. M. Rumble et al. [20] showed that current standard dynamic memory allocators, such as malloc in C, do not handle this problem well.

Figure 4: Request processing in KELI.

Therefore, to avoid fragmentation, we do not use dynamic memory allocators to create room for data. Free space is instead reserved in advance in the form of contiguous memory slots, which KELI fills up with the content of new items.

KELI updates its content periodically in batch style. Every time the update process is triggered, KELI first allocates new memory regions for the new items, then fills them up with the content of the data stored in the disk-based database. After that, it deallocates the memory regions of the old data and uses the items in the new memory regions to answer upcoming requests from applications. As clients bypass the server when chasing links, KELI must ensure that applications do not access old items after an update has taken place; it does so by halting all active connections from the clients and having the clients reestablish those connections to the server to obtain the new content.

Similar to key-value stores, KELI employs a hash table to track items by their key. We use cuckoo hashing [18] to implement the hash table, since this technique ensures constant lookup complexity in the worst case and guarantees stable performance on large datasets. The hash table does not hold the content of the hashed items; it instead stores pointers to the actual data. So for each new item, KELI first finds a slot for it in the allocated memory regions and writes its content to this slot. After that, the item's key and the pointer to the slot are added to the hash table.

Figure 5: Modeling Twitter datasets with the KVL model. Black boxes represent a list of links, each link referring to one item in the dataset. Gray boxes represent a single link.
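The put path described above (pre-allocated slots plus a pointer-only hash table) can be summarized in a few lines of Python. This is a hypothetical sketch, not KELI code: a bytearray stands in for a pre-allocated contiguous region and a plain dict stands in for the cuckoo hash table.

```python
class Region:
    """A pre-allocated contiguous memory region; items are appended,
    so there is no dynamic allocation and no fragmentation."""
    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.used = 0

    def append(self, data: bytes):
        off = self.used
        self.buf[off:off + len(data)] = data
        self.used += len(data)
        return off, len(data)          # exactly what a link/pointer records

region = Region(1 << 20)
index = {}                             # stand-in for the cuckoo hash table

def put(key: str, value: bytes):
    # The hash table stores only the pointer, never the item content.
    index[key] = region.append(value)

def get(key: str) -> bytes:
    off, size = index[key]
    return bytes(region.buf[off:off + size])

put("stu001", b"name=ABY;supervisor=prof001")
print(get("stu001"))
```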
Some stores, such as HBase [1], keep items in memory in the form of memory objects to simplify data management. This approach, however, often requires the server and client to serialize/deserialize those objects to/from arrays of bytes before transferring them over the network. Serialization adds significant overhead to request processing, especially for small requests like item lookups. Furthermore, if items contained pointers to different resources, chasing links would generate multiple RDMA Reads, making the operation inefficient. Therefore, KELI servers store items in the form of byte arrays and let the client perform the serialization.

5.3 Request processing

(a) Fragments represented by the KVL model. (b) Web page content.

Figure 6: Modeling a dynamic web page with the KVL model. Black boxes represent a list of links, each link referring to one item in the dataset. Gray boxes represent a single link.

In this subsection, we show how KELI handles requests from clients. The whole process is shown in Figure 4. KELI's communication modules are built upon the IB verbs programming model. Given a key, the client asks for its value by issuing a "get" request via "ib_send". The request is received at the server side by a listener, which is responsible for receiving all incoming requests. In order to maximize KELI's performance, the listener continuously polls the input queue for new requests instead of passively waiting for the queue to notify it of a new message, as traditional techniques do. Although this approach wastes many CPU cycles polling the input queue, it makes KELI respond to new requests very quickly.

When the listener discovers a new request in the input queue, it pops the request and forwards it to a worker thread in the thread pool. Threads are chosen randomly to ensure load balancing. After receiving a request from the listener, the chosen thread searches for the needed item in the hash table. If the item is not found, it generates a response with an empty payload and sends it back to the client using the "ib_recv" operation. Otherwise, the hash table returns a pointer to the location of the item. The thread simply follows the pointer, generates a non-empty response message, copies the content of the item into the payload of this message and sends the response back to the client (also using "ib_recv").
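The listener/worker pipeline can be sketched as below. Python queues stand in for the verb queues, a blocking get() replaces the busy-polling loop for brevity, and the queue names, item layout, and 4-thread pool size are all illustrative assumptions, not KELI's actual code.

```python
import queue
import random
import threading

MEMORY = bytearray(b"hello-world-items...")   # pre-loaded item region
INDEX = {"C": (0, 5), "A": (6, 5)}            # key -> (offset, size) pointer

rx = queue.Queue()                            # stand-in for the receive queue
workers = [queue.Queue() for _ in range(4)]

def listener():
    # Real KELI busy-polls the input queue; blocking keeps the sketch short.
    while True:
        req = rx.get()
        random.choice(workers).put(req)       # random choice = load balancing

def worker(inbox):
    while True:
        key, reply = inbox.get()
        ptr = INDEX.get(key)                  # hash-table lookup
        if ptr is None:
            reply.put(b"")                    # empty payload when not found
        else:
            off, size = ptr
            reply.put(bytes(MEMORY[off:off + size]))

threading.Thread(target=listener, daemon=True).start()
for w in workers:
    threading.Thread(target=worker, args=(w,), daemon=True).start()

resp = queue.Queue()
rx.put(("C", resp))                           # a client "get" request
print(resp.get())                             # b'hello'
```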
The client is responsible for interpreting the payload of the response message. If the item contains links to other items and the application wants to retrieve them, the client does not make another "get" request, but instead uses RDMA Reads to directly read the content of the linked items from the server. Doing so significantly reduces item-loading latency, since executing an RDMA Read is much cheaper than explicitly invoking an item lookup request (e.g. "get").

6 Experiments

6.1 Experiment setup

In this section, we illustrate the benefits of employing the KVL model for RDMA-based in-memory stores by comparing the performance of KELI with that of another in-memory key-value store. We chose Memcached for this task due to its popularity. In fact, to make the comparison fair, instead of using the original version we use an extended version of Memcached, which uses RDMA verbs for data transmission, for all experiments [11, 10].

The two stores are compared on practical applications. Particularly, we use the KVL model to represent several real-life datasets and let KELI manage them. We do the same with Memcached, except that the links in items are replaced by the keys of the referred items. We then develop applications implementing popular algorithms working over those datasets. The data such applications need for computation is fetched from either KELI or Memcached; we measure the computation cost and use it to compare the two stores.

6.2 Data modeling

We conduct experiments on three different real-life datasets, each associated with one of the problems listed in Section 4. In the text below, we describe those datasets and how the KVL model is used to model them. For the key-value version, we simply replace each link with the key of the item it points to.

Social Network Graph traversal is very common in social networks. For example, given a user, finding a person with a given name (e.g. "John") among his friends, his friends' friends, and so on is a typical problem making use of graph exploration. In this experiment, we perform Breadth-First Search (BFS) over a real-life social network dataset provided by Twitter. The dataset contains about one million nodes representing users and more than 22 million edges representing followership between users. Figure 5 shows how the dataset is modeled with the KVL model. This representation is similar to an adjacency list data structure, except that the edge (e.g. follower) lists are broken into multiple chunks, since one user may have a lot of friends; integrating all of them into one item could enlarge this item considerably, leading to performance degradation. In the following experiments, we let each list chunk contain at most 100 followers.

Web Page Generation We construct a web page displaying information about reviews of products sold by Amazon, using the dataset provided by Amazon itself. The content of the page is dynamic, as product information changes frequently and users continuously update their reviews of products. We have to break the HTML file into multiple parts and change their content right after an update takes effect. Figure 6a shows the relationship between users, products, and users' reviews of some products, and Figure 6b shows the look and feel of the page's HTML.

Map Matching We choose the map matching problem as a representative application for ITS systems. Given a GPS signal, we have to determine whether this signal belongs to any street and, if so, identify which place on the street it falls into.
This problem is very common in ITS systems, being involved in real-time traffic monitoring, congestion detection, routing, etc. In this experiment, we use a digital map provided by OpenStreetMap (OSM) to construct the dataset about streets in Ho Chi Minh City. Figure 7 illustrates an example of modeling the map with the KVL model. Particularly, according to OSM's format, a street is a polyline constructed by connecting multiple nodes (points). Since a street polyline is typically long, we do not match GPS signals against streets but against the lines constructed by connecting two consecutive nodes on a street, called segments. An item representing a segment links to the items containing information about its endpoints (nodes). There are also links from streets to the segments constructed from their nodes. We group segments into disjoint areas called cells, based on their geographical location.

(a) Objects on the map. (b) Objects in the KVL model.

Figure 7: Modeling objects on a digital map with the KVL model. Black boxes represent a list of links, each link referring to one item in the dataset. Gray boxes represent a single link.

Figure 8: The latency of link chasing and item lookup in our experiments.

The map matching algorithm is quite simple: given a GPS signal, the application first determines its spatial information (e.g. latitude and longitude) and uses it to identify the corresponding cell. It then queries the in-memory store for this cell and retrieves the segments belonging to it, to find out which segment the signal belongs to based on their geographical locations. A sketch of this procedure follows.
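Below is a hypothetical sketch of this cell-based lookup in Python. The cell is fetched by key (in practice its identifier would be computed from the coordinates); the segments and their endpoints are then reached by link chasing, with a dict standing in for server memory as in the earlier sketches. The point-to-segment distance test is simplified to a distance to the segment midpoint.

```python
import math

memory = {
    0x1: {"lat": 10.771, "lon": 106.698},            # nodeIJK
    0x2: {"lat": 10.772, "lon": 106.700},            # nodeUVT
    0x3: {"lat": 10.780, "lon": 106.710},            # nodePQO
    0x10: {"endpoint0": 0x1, "endpoint1": 0x2},      # segXYZ
    0x11: {"endpoint0": 0x2, "endpoint1": 0x3},      # segLMN
}
cells = {"cellABC": [0x10, 0x11]}                    # cell key -> segment links

chase = memory.__getitem__                           # RDMA Read stand-in

def dist_to_segment(lat, lon, seg):
    """Simplified distance to the segment's midpoint (a real matcher
    would project the point onto the segment line)."""
    a, b = chase(seg["endpoint0"]), chase(seg["endpoint1"])
    mid_lat = (a["lat"] + b["lat"]) / 2
    mid_lon = (a["lon"] + b["lon"]) / 2
    return math.hypot(lat - mid_lat, lon - mid_lon)

def match(lat, lon, cell_key):
    segments = cells[cell_key]                       # one key lookup per signal
    return min(segments,
               key=lambda s: dist_to_segment(lat, lon, chase(s)))

print(hex(match(10.7715, 106.699, "cellABC")))       # -> 0x10 (segXYZ)
```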
6.3 Performance evaluation

We conduct all experiments on two computers equipped with Intel Xeon E5-2670 processors and 32 GB of main memory. They are connected through an InfiniBand link using Mellanox ConnectX-3 40 Gbps NICs. The RDMA-Memcached used in all experiments is based on Memcached version 1.4.24, and the applications use libMemcached version 1.0.18 to communicate with the store.

In order to fully understand the effect of using the KVL model, let us first compare the read performance of KELI's item lookup and link chasing with RDMA-Memcached's get operation. Figure 8 shows the experimental results. Clearly, the naive implementation of the lookup using Send/Recv verbs performs very poorly: it is about three to four times slower than the optimized version used by RDMA-Memcached. However, even the optimized item lookup still takes about twice as long as link chasing. Therefore, if applications make good use of links, KELI can perform better than RDMA-Memcached.

Although KELI has to deserialize item content and check for consistency when chasing links, link-chasing latency is only slightly higher than that of the pure RDMA Read reported in Figure 3. This is because the time spent on communication is the dominant cost of RDMA operations. So although KELI has to check consistency and deserialize every item it reads, its latency is still lower than that of HERD. Also note that HERD's lookup latency could be higher in practice, as it sacrifices reliability and lets applications take care of integrity checks in order to boost lookup performance as much as possible.

Figure 9: Map matching latency.

Figure 10: Web page construction latency.

Figure 11: Graph exploration.

In the map matching experiment, we preload both KELI and RDMA-Memcached with about six million key-value pairs representing the geographical information of Ho Chi Minh City. Similarly, we prepare about 200 thousand reviews of more than 12 thousand products for the web creation application, and a graph with one million nodes and about 22 million edges for the BFS traversal. Figures 9, 10, and 11 show the execution times of the map matching, web page construction, and BFS algorithms, respectively, using KELI and RDMA-Memcached. KELI outperforms RDMA-Memcached in all cases.

In the case of map matching, KELI outperforms RDMA-Memcached by a factor of two on average. For tail latency (95th percentile), KELI still runs about 2.5 times faster than RDMA-Memcached. KELI also helps applications construct web pages 50% faster than RDMA-Memcached does. Similarly, the implementation of the BFS algorithm using KELI runs 75% faster than the one using RDMA-Memcached.

The reason is that, with the data layouts described in the previous section, applications utilizing KELI mostly use link chasing to fetch new items. For example, in the case of graph traversal, the application only has to invoke an item lookup once, to retrieve the first vertex. After that, based on the "list" and "next" links integrated into each accessed vertex and edge list, the application can always invoke link chasing to get information about the next vertex to be visited. On the other hand, applications supported by RDMA-Memcached have no choice but item lookups to retrieve data. Since an item lookup is about two times slower than link chasing, KELI performs about two times better than RDMA-Memcached.

7 Conclusion

In this paper, we presented KVL, an enhanced version of the key-value model for in-memory stores working over RDMA-capable networks. In this model, each dataset is a network of key-value pairs linked to each other. Each link is a pointer to the address of the referred item and is integrated directly into the item. With this organization, the KVL model introduces a new operation, named link chasing, that allows applications to utilize RDMA Reads to directly read items through links without involving the data server. Our experiments have shown that this model fits many real-life applications well. Also, by utilizing this model, KELI, an ordinary in-memory store without much optimization, easily outperforms a state-of-the-art in-memory store.

References

[1] HBase. https://hbase.apache.org/. Accessed: 2017-03-03.
[2] Memcached. https://memcached.org/. Accessed: 2016-11-07.
[3] Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. Workload analysis of a large-scale key-value store. In ACM SIGMETRICS Performance Evaluation Review, volume 40, pages 53–64. ACM, 2012.
[4] Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, et al. TAO: Facebook's distributed data store for the social graph. In 2013 USENIX Annual Technical Conference (USENIX ATC 13), pages 49–60, 2013.
[5] Jim Challenger, Arun Iyengar, Karen Witting, Cameron Ferstat, and Paul Reed. A publishing system for efficiently creating dynamic web content. In INFOCOM 2000: Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies, volume 2, pages 844–853. IEEE, 2000.
[6] Yanzhe Chen, Xingda Wei, Jiaxin Shi, Rong Chen, and Haibo Chen. Fast and general distributed transactions using RDMA and HTM. In Proceedings of the Eleventh European Conference on Computer Systems, page 26. ACM, 2016.
[7] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon's highly available key-value store. ACM SIGOPS Operating Systems Review, 41(6):205–220, 2007.
[8] Aleksandar Dragojević, Dushyanth Narayanan, Miguel Castro, and Orion Hodson. FaRM: Fast remote memory. In 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pages 401–414, 2014.
[9] Bin Fan, David G. Andersen, and Michael Kaminsky. MemC3: Compact and concurrent Memcache with dumber caching and smarter hashing. In 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), pages 371–384, 2013.
[10] Jithin Jose, Hari Subramoni, Krishna Kandalla, Md Wasi-ur Rahman, Hao Wang, Sundeep Narravula, and Dhabaleswar K. Panda. Scalable Memcached design for InfiniBand clusters using hybrid transports. In Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on, pages 236–243. IEEE, 2012.
[11] Jithin Jose, Hari Subramoni, Miao Luo, Minjia Zhang, Jian Huang, Md Wasi-ur Rahman, Nusrat S. Islam, Xiangyong Ouyang, Hao Wang, Sayantan Sur, et al. Memcached design on high performance RDMA capable interconnects. In 2011 International Conference on Parallel Processing, pages 743–752. IEEE, 2011.
[12] Anuj Kalia, Michael Kaminsky, and David G. Andersen. Using RDMA efficiently for key-value services. In ACM SIGCOMM Computer Communication Review, volume 44, pages 295–306. ACM, 2014.
[13] Anuj Kalia, Michael Kaminsky, and David G. Andersen. Design guidelines for high performance RDMA systems. In 2016 USENIX Annual Technical Conference (USENIX ATC 16), 2016.
[14] Minh Duc Le, The De Vu, Duc Hieu Nguyen, Tien Hai Ho, Duc Hai Nguyen, Tran Vu Pham, et al. KELI: a key-value-with-links in-memory store for realtime applications. In Proceedings of the Seventh Symposium on Information and Communication Technology, pages 195–201. ACM, 2016.
[15] Christopher Mitchell, Yifeng Geng, and Jinyang Li. Using one-sided RDMA reads to build a fast, CPU-efficient key-value store. In 2013 USENIX Annual Technical Conference (USENIX ATC 13), pages 103–114, 2013.
[16] Richard C. Murphy, Kyle B. Wheeler, Brian W. Barrett, and James A. Ang. Introducing the Graph 500. Cray User's Group (CUG), 2010.
[17] John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazières, Subhasish Mitra, Aravind Narayanan, Guru Parulkar, Mendel Rosenblum, et al. The case for RAMClouds: Scalable high-performance storage entirely in DRAM. ACM SIGOPS Operating Systems Review, 43(4):92–105, 2010.
Defense Strategies against Byzantine Attacks in a Consensus-Based Network Intrusion Detection System

Michel Toulouse, Hai Le and Cao Vien Phung
Vietnamese German University, Vietnam
E-mail: michel.toulouse, hai.lh@vgu.edu.vn, caovienphung@gmail.com

Denis Hock
Frankfurt University of Applied Sciences, Germany
E-mail: dehock@fb2.fra-uas.de

Keywords: network security, intrusion detection, consensus algorithms, Byzantine attacks

Received: March 29, 2017

The purpose of a Network Intrusion Detection System (NIDS) is to monitor network traffic so as to detect malicious usage of network facilities. NIDSs can also be part of the affected network facilities and be the subject of attacks aiming at degrading their detection capabilities. The present paper investigates such vulnerabilities in a recent consensus-based NIDS proposal [1]. This system uses an average consensus algorithm to share information among the NIDS modules and to develop coordinated responses to network intrusions. It is known, however, that consensus algorithms are not resilient to compromised nodes sharing falsified information, i.e. they can be the target of Byzantine attacks. Our work proposes two different strategies aiming at identifying compromised NIDS modules that share falsified information. Also, a simple approach is proposed to isolate compromised modules, returning the NIDS to a non-compromised state. Validations of the defense strategies are provided through several simulations of Distributed Denial of Service attacks using the NSL-KDD data set. The efficiency of the proposed methods at identifying compromised NIDS nodes and maintaining the accuracy of the NIDS is compared. The computational cost of protecting the consensus-based NIDS against Byzantine attacks is evaluated. Finally, we analyze the behavior of the consensus-based NIDS once a compromised module has been isolated.

Povzetek: Network intrusion detection systems rely on detecting unusual traffic patterns, but they are themselves vulnerable to attacks. This paper describes defenses against Byzantine attacks.

1 Introduction

Network intrusion detection systems are part of a vast array of tools that protect computer infrastructures against malicious activities.
The specific task of NIDSs is to monitor computer network infrastructures, seeking to identify malicious intent through the analysis of network traffic. Today's computer networks are quite large, composed of several heterogeneous sub-networks. Consequently, traffic monitoring often needs to be done distributively, with sensors and traffic analysis modules placed at different strategic locations, each in charge of monitoring and analyzing the traffic of a specific sub-network.

Usually, NIDS monitoring modules are connected by a network, allowing security information collected about a sub-network to be shared with other NIDS modules. Which information is shared, and how it is shared, often characterizes the organization of NIDSs as centralized, hierarchical or distributed [2]. The monitoring modules of centralized and hierarchical NIDS architectures, which can be limited to simply collecting data, send their information up the hierarchy for further analysis. To the extent that analysis and responses depend on a single module or a few modules in the NIDS, these systems can be completely incapacitated by attacks that target the more intelligent modules. A common mitigation for these risks is to avoid a single point of failure by using distributed Intrusion Detection Systems [3, 4]. Modules in distributed intrusion detection systems are often full-scale sensing and analytical devices. The modules cooperate by sharing information to address attacks from concurrent sources (such as distributed denial of service), to develop network-wide coordinated responses to attacks, or simply to increase the detection accuracy of each NIDS module. Early distributed systems [5, 6] were also built upon a master-slave architecture and required the data to be sent to a central location for further analysis. Today, using peer-to-peer systems [7, 8, 9, 10], it is possible to recognize attacks by analyzing shared information in a fully distributed manner.

While it is more difficult to completely disable distributed systems compared to centralized ones, modules of a distributed system can still be the target of attacks aiming to disable the system locally or to mask attacks in some sub-networks from other nodes of the distributed NIDS. The present research addresses the vulnerability of a recently proposed fully distributed NIDS [1]. This system uses an average-consensus algorithm for computing network-wide security information that can then be used to recognize attacks and activate coordinated responses to malignant activities. However, it is well known that consensus algorithms are not resilient to compromised nodes sharing falsified information, i.e. they can be the target of Byzantine attacks.

Consensus algorithms are based on peer-to-peer communications among neighbor nodes of a computer network (no routing). They are distributed iterative algorithms in which each node of the network repeatedly updates its current value based on its own previous value and the previous values of its neighbors in the network. The objective is to reach a "consensus", i.e. each node computes the same output, which depends on initial values distributed across the network, while using only local updates. By repeating such local computations, and given overlapping neighborhoods, a consensus eventually emerges by diffusion of local updates. Consensus algorithms have a long history in computer science, where they provide solutions to distributed computing problems.
For example, consensus algorithms solve the leader election problem, where processes must select one of them to coordinate tasks in a distributed system [11]. Consensus algorithms have also found applications or research interest in physics [12], process control [13], robotics [14], operations research [15], and services at IoT edge nodes [16], not to mention their application in the controversial bitcoin currency [17].

Average consensus refers to a particular form of consensus where cooperative nodes compute the average of their initial values. Average consensus algorithms also have a wide range of applications; for example, we find them recently in wireless network applications such as cooperative spectrum sensing in cognitive radio networks [18], distributed detection in wireless networks [19], and sensor networks [20].

The vulnerability of consensus algorithms to the sharing of falsified information has been known for a long time. Originally, consensus algorithms solved the problem of reaching agreement assuming a non-faulty, non-adversarial computing environment. In reality, links can fail, nodes can stop transmitting data to neighbors (faulty links, nodes), or nodes can transmit incorrect data, possibly falsified by an adversarial actor (Byzantine nodes). The resilience of consensus algorithms has been analyzed in the context of fault-tolerant systems by Lamport, Pease and Shostak [21, 22]. The problem of reaching consensus in faulty and adversarial environments became known as the Byzantine agreement problem [22]. The problem asks under which conditions consensus can be reached in the presence of Byzantine faults. In [21], it is proved that resilient consensus algorithms cannot be designed in a fully connected network (complete graph) of $n$ processors if the number $m$ of Byzantine nodes satisfies $n \leq 3m$; equivalently, resilience requires $n \geq 3m + 1$. The Byzantine agreement problem in [22] refers specifically to attacks in which Byzantine nodes modify the initial values for which consensus is computed (data falsification attacks). Since then, the Byzantine agreement problem has been adapted to consider new failure conditions, i.e. different attack models, as well as quite diverse network settings.

Research studies aiming at detecting Byzantine nodes are of particular relevance to our work. In [23, 24], a technique based on the detection of outliers is applied to find compromised nodes in a consensus-based spectrum sensing algorithm for ad hoc wireless networks. In [24], several attack models are proposed to subvert the spectrum sensing algorithm. One attack model is a covert adaptive data injection attack, which adjusts attack strategies by manipulating the sensing results. The proposed defense consists of isolating neighbor nodes that send numerical data deviating too much from some norm. In [25], the detection of Byzantine nodes is derived from reputation-based trust management strategies. In that paper, one type of attack consists of malicious robots injecting false data to neighbors in a multirobot system controlled by a consensus algorithm for the purpose of formation control. The proposed defense system consists of decreasing the consensus weight contribution of a node whose reputation drops during the computation of consensus states. Defense strategies against Byzantine attacks also originate from research in process control and control theory, fields where one focus is to provide methodological approaches to detect faulty components in a system.
In [26, 27, 28], different approaches based on control theory are proposed to detect Byzantine attacks on consensus algorithms. In [26, 27], using model-based fault detection techniques, it is shown that if the network of consensus nodes is $2k + 1$ connected then up to $k$ Byzantine nodes can be identified. However, model-based proposals for detecting multiple attackers seem computationally costly; they likely have only limited applicability.

Our work focuses on detecting Byzantine attackers in the consensus-based NIDS of [1]. One of our two detection techniques, outlier detection, derives from outlier methods used to monitor applications of consensus algorithms to cooperative spectrum sensing [24]. The second detection technique, fault detection, derives from a model-based fault detection technique in process control and control theory [27]. We also introduce an approach to remove compromised NIDS modules such that the intrusion detection system can be returned to a non-compromised state.

The removal of compromised modules conflicts with some mathematical assumptions about average consensus algorithms. Indeed, proofs that neighbor-to-neighbor data exchanges converge to a consensus are valid under the assumption that the network is static. The removal of a compromised module changes the network topology of the NIDS; thus the system is no longer guaranteed to work correctly even under normal circumstances (no attacks). Here, the relevant background research comes from dynamic consensus theory, concerned with applications of consensus algorithms to dynamic network topologies, facing issues such as communication time-delays, failing physical links or network nodes, and mobile wireless networks [29, 30, 31]. While our objective is only to logically remove (isolate) a compromised NIDS module, the network conditions this creates are quite similar to those in [32, 33, 34]; therefore we have drawn our solution more specifically from these works.

All the solutions proposed in this paper are based on local knowledge. Decisions to categorize modules as compromised, and to further remove a compromised module, depend on information gathered from neighbors only. The computation to detect and remove compromised modules is therefore fully distributed, thus keeping the consensus-based NIDS fully distributed. Last, we have designed our defense strategies to protect the NIDS against a single compromised module. Detecting and removing multiple compromised modules from potentially coordinated attackers is left to future work.

The major contributions of this paper include presenting the impact of malicious peers on the detection capability of our consensus-based Network Intrusion Detection System (NIDS) scheme. We analyze the vulnerabilities of consensus-based NIDSs by proposing a Byzantine attack model which aims to adjust and stealthily manipulate results. Our defense strategies detect and remove compromised NIDS modules without impacting the logical functionality of the system. We compare these strategies under various detection parameters and network topologies through extensive simulations and analysis using a real NIDS and the NSL-KDD data set [35]. Our results demonstrate that the proposed methods can indeed unveil peers with malicious intent and disruptions in the information exchange of a peer-to-peer NIDS.

In the remainder, we briefly describe the consensus-based NIDS in [1].
We point out variations of falsification attacks and outline our two detection techniques to adjust the trustworthiness of participating peers. Thereafter, we illustrate the salient features of our prediction model to identify Byzantine peers and describe a practical experiment we conducted to showcase its functionality.

2 Consensus based NIDS

This section describes the average consensus algorithm. Next, a summary of the consensus-based NIDS in [1] is provided. Lastly, we describe our approach to isolate compromised modules, together with the mathematical background that supports this approach.

2.1 Average consensus

The average consensus algorithm computes the average $\frac{1}{n}\sum_{i=1}^{n} x_i$ of some initial values $x_1, x_2, \ldots, x_n$. It is a distributed algorithm where each process can be viewed as running independently on a particular node of an undirected graph. Let $G = (V, E)$ be such a graph, where $V = \{v_1, v_2, \ldots, v_n\}$ denotes the set of nodes and $E$ denotes the corresponding set of edges. Graphs have an adjacency structure represented by an $n \times n$ adjacency matrix (denoted by $A$ here) where $a_{ij} = 1$ if and only if $(v_i, v_j) \in E$, and $a_{ij} = 0$ otherwise. The adjacency structure of $G$ defines for each node $v_i \in G$ a neighborhood $N_i$, where $N_i = \{v_j \in V \mid (v_i, v_j) \in E\}$. Each node $v_i$ of $G$ computes the following recurrence equation:

$$x_i(t+1) = W_{ii}\,x_i(t) + \sum_{j \in N_i} W_{ij}\,x_j(t), \qquad (1)$$

where recurrence $i$ is initialized with $x_i(0) = x_i$, the initial value of each node $i$ (from now on we denote node $v_i$ simply by $i$). The purpose of a consensus algorithm is to reach "consensus", i.e. $x_i(t)$ converges asymptotically to $\frac{1}{n}\sum_{i=1}^{n} x_i$ for all nodes $i \in G$. As evident from Equation (1), each node $i$ obtains $x_i(t+1)$ using only its previous value $x_i(t)$ and the previous values $x_j(t)$ of the nodes that are in the neighborhood of $i$ ($x_j$, $j \in N_i$). Nonetheless, all nodes converge to $\frac{1}{n}\sum_{i=1}^{n} x_i$ because the diffusion of the local averages through neighborhoods that share common nodes accounts for all nodes computing the global average.

Whether nodes reach consensus, and which particular consensus value is reached, is determined by the dynamics of the linear dynamical system that equation (1) specifies, which in turn depends on the transition matrix $W$. Each entry $W_{ij}$ of matrix $W$ represents a weight on edge $(i, j) \in G$. These individual weights have to be chosen carefully to ensure convergence, and convergence to a specific value. For example, in equation (1), consensus on $\frac{1}{n}\sum_{i=1}^{n} x_i$ can be obtained by computing local averages of $x_i(t)$ and $x_j(t)$ for $j \in N_i$ using $W_{ij} = \frac{1}{|N_i|+1}$ for $(i, j) \in G$ (including the self-edge $(i, i)$).

A system as in (1) can reach consensus if the weight matrix satisfies certain conditions, as stated in [36]. Two conditions concern our application of average consensus to network intrusion detection: 1) the undirected graph $G$ needs to be connected, i.e. there is a path between each pair of nodes; 2) the weight matrix $W$ must be row stochastic, i.e. $\sum_{j=1}^{n} W_{ij} = 1$, the weights of each row sum to 1 (note that for an undirected graph $W_{ij} = W_{ji}$, therefore $W = W^T$; consequently the weight matrix $W$ is doubly stochastic, $\sum_{j=1}^{n} W_{ij} = \sum_{i=1}^{n} W_{ji} = 1$). Several weight matrices satisfy these conditions; the following matrices have been used for the consensus-based NIDS:

– Metropolis-Hastings matrix:

$$W_{ij} = \begin{cases} \frac{1}{1+\max(d_i, d_j)} & \text{if } i \neq j \text{ and } j \in N_i \\ 1 - \sum_{k \in N_i} W_{ik} & \text{if } i = j \\ 0 & \text{if } i \neq j \text{ and } j \notin N_i \end{cases}$$

where $d_i = |N_i|$.
– Best-constant edge weight matrix:

$$W_{ij} = \frac{2}{\lambda_1(L) + \lambda_{n-1}(L)}$$

where $L$ is the Laplacian matrix of the NIDS network, and $\lambda_1, \lambda_{n-1}$ are the first and $(n-1)$-th eigenvalues of $L$.

– Local-degree weight matrix, where the weight of an edge is based on the largest degree of its two adjacent vertices:

$$W_{ij} = \frac{1}{\max\{d_i, d_j\}}.$$

– Max-degree weight matrix, where $d_{\max}$ is the largest degree of the vertices in the network:

$$W_{ij} = \frac{1}{d_{\max}}.$$

Note that for the last three matrices, $W_{ii} = 1 - \sum_{k \in N_i} W_{ik}$ and $W_{ij} = 0$ if $j \notin N_i$. Note also that these weight matrices guarantee asymptotic convergence: $x(t)$ converges to $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$ as $t \to \infty$; we refer to this average consensus algorithm as the asymptotic average consensus. Weight matrices have an impact on the speed of convergence (the number of iterations needed to get close enough to the average sum) [37, 38]. Given an average consensus application, it is worth comparing the convergence speed of different weight matrices to identify the one with the best performance. It is also worth noticing that the graph topology impacts the convergence speed of average consensus algorithms [39].

2.2 Consensus-based NIDS

Figure 1: Network Intrusion Detection System.

As pictured in Figure 1, a consensus-based NIDS is a set of modules, each placed strategically on nodes of the monitored computer network so as to observe traffic in the corresponding sub-network. Each module consists of traffic sensors that receive copies of all transported packets within the observed network, and calculates an initial local probability of observing benign or malignant network traffic. The NIDS modules observing local network traffic are themselves connected by a physical network. Without loss of generality, we assume that the physical links connecting pairs of NIDS modules are direct (wired or wireless) physical links. The NIDS network is modeled by a graph where each node of the graph represents an NIDS module. It is assumed that this graph is connected. For the purpose of analysis and comparisons, we study specific topologies of NIDS networks; we refer to such a specific network as an NIDS network topology.

2.2.1 Network traffic analysis

The detection method of each NIDS module is "anomaly based", using the well-known naive Bayes classifier. The analysis focuses on detecting Distributed Denial of Service (DDoS) attacks, such as Land attack, Syn flood and UDP storm. The naive Bayes classifier assesses the statistically normal behavior - the 'likelihood' of a set of values to occur - with the help of labeled historic data. Our set of $m$ features includes most of the variables offered by the NSL-KDD data set, such as the number of bytes, service, and number of connections. The probability of intrusion is computed for each of these features. $P(o_j \mid h)$ expresses the likelihood of the occurrence $o_j$ given the historic anomalous ($h_a$) or normal ($h_n$) occurrences. Thus, if events receive the same values as benign or malignant network traffic during training, they result in a high probability for those. Assuming conditional independence of the $m$ features, the joint likelihood $P(O_i \mid h)$ of NIDS module $i$ is the product of all feature likelihoods:

$$P(O_i \mid h) = \prod_{j=1}^{m} P(o_j \mid h). \qquad (2)$$

Each NIDS module locally assigns the joint likelihood, indicating the abnormality of each event.
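To make the per-module analysis concrete, the following is a minimal Python sketch of the joint log-likelihood computation of equation (2); the feature names and likelihood values are hypothetical placeholders, not the implementation evaluated in this paper.

```python
import math

# Hypothetical per-feature likelihood tables estimated from labeled historic
# data: P(o_j | h) as pairs (anomalous h_a, normal h_n) for each feature value.
likelihood = {
    "service":   {"http": (0.02, 0.70), "private": (0.55, 0.05)},
    "src_bytes": {"low": (0.10, 0.60), "high": (0.80, 0.10)},
}

def joint_log_likelihood(observation):
    """Return (log P(O_i | h_a), log P(O_i | h_n)), i.e. equation (2) in
    log space, assuming conditional independence of the features."""
    log_pa, log_pn = 0.0, 0.0
    for feature, value in observation.items():
        pa, pn = likelihood[feature][value]
        log_pa += math.log(pa)   # sum of logs equals log of the product
        log_pn += math.log(pn)
    return log_pa, log_pn

# Example: one observed event at a module
print(joint_log_likelihood({"service": "private", "src_bytes": "high"}))
```

Working in log space, as the paper does for the consensus phase, avoids underflow when the product of many small feature likelihoods is taken.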
2.2.2 Consensus phase

Following the sensing and data analysis by the Bayesian network, each NIDS module enters a phase where it computes the average of the $n$ log-likelihoods, $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$, while communicating only with direct neighbors. This phase is labeled the NIDS consensus phase. Let $x_i(0) = \log(P(O_i \mid h))$ be the initial state of module $i$, where $x_i(0)$ is the likelihood for module $i$ to see a certain set of network features. As explained in Section 2.1, the average is computed iteratively and independently by each module $i$ as a weighted sum of $x_i$ and the $x_j$ for $j \in N_i$, as defined in equation (1). We identify as the consensus loop the iterations of equation (1), and $x_i(t+1)$ as the consensus value of module $i$ at iteration $t+1$. The consensus phase is the computation performed by the $n$ consensus loops to reach consensus. This phase is defined in mathematical terms by the following dynamical system:

$$x(t+1) = W x(t), \qquad t = 0, 1, \ldots \qquad (3)$$

where $x(t)$ is a vector of $n$ entries denoting the $n$ consensus values at iteration $t$ of the consensus phase, and $W$ is the weight matrix.

The stopping condition of consensus loop $i$ (also known as the 'convergence parameter' of recurrence $i$) is given by $|x_i(t+1) - x_i(t)| < \epsilon$, i.e. when the change in the consensus value from iteration $t$ to iteration $t+1$ is smaller than a pre-defined threshold value $\epsilon$. For weight matrices satisfying the convergence assumptions, the value $|x_i(t+1) - x_i(t)|$ decreases asymptotically as $t \to \infty$; once this value is smaller than $\epsilon$, the corresponding consensus loop is said to have converged. A consensus phase is completed once each consensus loop has converged. The number of iterations of a consensus phase is given by the consensus loop that needed the largest number of iterations to satisfy the stopping condition. The convergence speed of a consensus phase is the number of iterations needed for the consensus phase to complete. The value of $\epsilon$ is set so as to minimize the number of iterations of the consensus phase while ensuring accuracy of the decision about the state of the network traffic. The consensus phase is synchronous: all nodes must have completed the consensus loop at iteration $t$ before proceeding to execute the consensus loop at iteration $t+1$. Finally, as a matter of implementation, once an NIDS module has converged, it stops updating its consensus value but continues to send the last updated value to its neighbors.
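As an illustration, below is a minimal Python sketch of a synchronous consensus phase using Metropolis-Hastings weights and the stopping condition above; the 4-node ring, the initial log-likelihoods and the value of epsilon are illustrative assumptions, not the code used in the experiments.

```python
# Minimal synchronous average-consensus sketch (Metropolis-Hastings weights).
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # 4-node ring
x = {0: -20.0, 1: -35.0, 2: -50.0, 3: -41.0}              # x_i(0), illustrative
EPS = 1e-6                                                # illustrative epsilon

def mh_weight(i, j):
    # W_ij = 1 / (1 + max(d_i, d_j)) for a neighbor j of i
    return 1.0 / (1.0 + max(len(neighbors[i]), len(neighbors[j])))

def step(x):
    """One synchronous application of equation (1) at every node."""
    new_x = {}
    for i in x:
        w_ii = 1.0 - sum(mh_weight(i, j) for j in neighbors[i])
        new_x[i] = w_ii * x[i] + sum(mh_weight(i, j) * x[j] for j in neighbors[i])
    return new_x

t = 0
while True:
    new_x = step(x)
    # consensus phase completes when every loop satisfies |x_i(t+1)-x_i(t)| < eps
    if all(abs(new_x[i] - x[i]) < EPS for i in x):
        break
    x, t = new_x, t + 1

print(t, x)  # all entries close to the average of the initial values (-36.5)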
2.3 Removing compromised modules

As discussed in the introduction, once an NIDS module $j$ has been identified by a neighbor $i$ as compromised, module $j$ must be logically disconnected from $i$ to maintain the integrity of the intrusion detection system. It is relatively simple to disconnect an NIDS module locally because the weight matrix $W$ is known to each NIDS module $i$ (or at least the weights associated with row $i$ are known). Once a node $i$ has identified a neighbor $j$ as compromised, node $i$ can simply apply the following change to the weight matrix: $W_{ij} = 0$. Unfortunately, $W_{ii} + \sum_{j \in N_i} W_{ij}$ then no longer sums to 1, and $W$ fails to satisfy one of the two consensus convergence conditions. In order to fully address this issue, we have revisited the convergence proofs of average consensus, more specifically the convergence proofs for dynamic consensus (consensus under dynamic network topologies).

The consensus algorithm described in Section 2.1 is a static consensus algorithm because the weight matrix stays unchanged during the consensus phase. The weight matrix (which is actually a weighted adjacency matrix) mirrors the physical network topology underlying the NIDS. Static consensus cannot be used for applications where the underlying network topology is dynamic, i.e. where links or nodes fail, or where nodes enter and leave the network dynamically, as in wireless ad-hoc networks. Dynamic consensus theory formally addresses consensus convergence issues arising in dynamical networks. Dynamic consensus is relevant to our work because the impact on the NIDS of logically removing a compromised module is (model wise) the same as that of a failing node. The convergence theory of dynamic consensus is the mathematical support for our solution strategy for the removal of compromised nodes in a consensus-based NIDS. There are several avenues in control theory to address dynamic networks; the work in [32, 33, 34] is directly related to our problem.

As stated in Section 2.1, the two convergence conditions the consensus phase of the NIDS must satisfy are network connectivity and a stochastic weight matrix. In [32], it is shown that the connectivity condition is surprisingly mild for dynamic network topologies: the collection of dynamically changing topologies during the consensus phase only needs to be jointly connected to guarantee convergence. In our work this condition is always satisfied. Only one NIDS module is removed during a consensus phase, therefore the collection of network topologies is limited to two. For the NIDS network topologies tested in the experimentation section of this paper, each topology is connected. The convergence condition related to the stochastic weight matrix, however, is violated; this is fixed as follows. Once a node $i$ has identified a neighbor $j$ as compromised, node $i$ sets $W_{ij} = 0$, thus locally removing the link $(i, j)$. The situation where $W_{ii} + \sum_{j \in N_i} W_{ij} < 1$ resulting from setting $W_{ij} = 0$ is eliminated by increasing the weight of the self-edge by the same amount $W_{ij}$: $W_{ii} = W_{ii} + W_{ij}$. This solution is only implementable if the information for updating the weights of row $i$ in $W$ can be computed locally. This is the case for the Metropolis-Hastings weight matrix, as the weight of each edge depends on the degree of the adjacent nodes. It will not work for weight matrices like the best-constant edge weight matrix or the max-degree weight matrix, where the weights depend on global information (such as the max-degree node in the network). Tests in this paper where nodes are logically disconnected in an NIDS network topology are based on the Metropolis-Hastings weight matrix.
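A minimal sketch of this local repair, under the Metropolis-Hastings weights of Section 2.1; the dict-based data structures are illustrative assumptions.

```python
# Locally disconnect a compromised neighbor j from module i while keeping
# row i of the weight matrix stochastic (illustrative sketch).
def isolate_neighbor(W, neighbors, i, j):
    """Set W[i][j] = 0 and fold the removed weight into the self-edge,
    so that W[i][i] + sum of W[i][k] over remaining neighbors k equals 1."""
    removed = W[i][j]
    W[i][j] = 0.0
    W[i][i] += removed          # W_ii <- W_ii + W_ij
    neighbors[i].remove(j)      # module j is no longer read in the loop

# Example with row 0 of the 4-node ring from the earlier sketch:
W = {0: {0: 1/3, 1: 1/3, 3: 1/3}}
neighbors = {0: [1, 3]}
isolate_neighbor(W, neighbors, 0, 3)
print(W[0])   # {0: 2/3, 1: 1/3, 3: 0.0}; the row still sums to 1
```

Because the Metropolis-Hastings weights of row $i$ depend only on the degrees of $i$ and its neighbors, this repair needs no global information, which is exactly why it is the weight matrix used in the disconnection tests.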
3 Detection of Byzantine attacks

Byzantine attacks aim at degrading the accuracy of the network intrusion detection system. Accuracy is defined as follows:

$$\frac{TP + TN}{TP + TN + FP + FN},$$

where TP (True Positive) is the number of attacks detected when there is actually an attack; TN (True Negative) is the number of normals detected when the traffic is actually normal; FP (False Positive) is the number of attacks detected when the traffic is actually normal; and FN (False Negative) is the number of normals detected when there is actually an attack. Byzantine attacks on NIDS modules can aim at masking malicious traffic by decreasing the probability of attacks initially computed by the naive Bayesian classifier. Attackers may also increase the probability of attacks computed by the naive Bayesian classifier, thus increasing the number of false positives; the reliability of the system is then questioned by the system administrators. This section first provides an attack model on the consensus phase; this model is used by the tests conducted in the next section. Second, two techniques are described which aim at identifying compromised NIDS modules.

Figure 2: Convergence speeds with and without loop disruption.

3.1 Byzantine attack model

Byzantine attacks on the consensus phase of consensus-based intrusion detection algorithms can take the following forms [27, 40]:

1. Data falsification attacks: sensor values are falsified, thus the consensus loop is initialized with values originating from falsified network traffic readings;

2. Consensus loop disruptions:

(a) the attacker ignores the consensus value computed at each iteration and keeps transmitting the same constant $c$;

(b) the attacker sends its neighbors a falsified consensus value [27].

Figure 2 illustrates the impact of a type 2(a) attack on the convergence speed of the consensus phase. It plots the distribution of the convergence speed of 1000 consensus phases, each having only honest NIDS modules (No attack), versus a scenario where each consensus phase has one compromised module sending the same constant value $c$ to its neighbors (Attack). Figure 2 shows that convergence is much slower in a compromised system: each consensus phase needs between 250 and 300 iterations to converge, while in a system without a compromised module, consensus phases need between 40 and 125 iterations to converge. Moreover, NIDS modules in a compromised system fail to converge to the average consensus $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$; rather, they all converge to $c$ [41].

In this paper we seek to discover consensus loop disruption attacks of type 2(b). Equation (4) below models this type of attack inside the consensus loop of a compromised NIDS module:

$$x_j(t+1) = W_{jj}\,x_j(t) + \sum_{i \in N_j} W_{ji}\,x_i(t) + u_j(t). \qquad (4)$$

This recurrence equation is similar to equation (1) except for the variable $u_j(t)$, which models the value selected by the attacker for modifying the true consensus value of the compromised node $j$. The falsified consensus value $x_j(t+1)$ is sent to all the neighbors of node $j$ at iteration $t+1$. Other Byzantine attack models, including multiple colluding attackers, are described in [24, 42].
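Building on the earlier consensus sketch, a type 2(b) disruption can be simulated by adding an injection term to the update of the compromised module, as in equation (4); the function and constants below are illustrative assumptions.

```python
# Sketch of a type 2(b) consensus loop disruption (equation (4)):
# the compromised module adds an external input u_j(t) to its honest
# update before sending the falsified value to its neighbors.
def byzantine_step(x, neighbors, weight, compromised, u_j=0.5):
    new_x = {}
    for i in x:
        w_ii = 1.0 - sum(weight(i, j) for j in neighbors[i])
        new_x[i] = w_ii * x[i] + sum(weight(i, j) * x[j] for j in neighbors[i])
        if i == compromised:
            new_x[i] += u_j      # u_j(t) is nonzero only at the Byzantine node
    return new_x
```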
3.2 Detection techniques

We describe two detection techniques that handle consensus loop disruptions of type 2(b) by a single Byzantine attacker. The first detection technique is an outlier detection procedure. This procedure is executed by each module $i$ and evaluates, at each consensus loop iteration, the potential that a neighbor of module $i$ is compromised. The second detection technique is an adaptation to cyber-attacks of a model-based fault-detection technique from process engineering and control theory. Like the first one, it is a procedure executed by each module, observing its neighbors and seeking to identify a compromised one.

3.2.1 Outlier detection

Outlier detection techniques have been applied to detect Byzantine attacks in wireless sensor networks [43]. These techniques use distance thresholds between the value $x_j(t)$ sent by a neighbor $j$ to node $i$ and some reference value $r_i$. For example, if $r_i(t) = x_i(t)$, neighbor $j$ is flagged as compromised if $|x_j(t) - x_i(t)| > \lambda$ for some threshold value $\lambda$. However, this idea had to be refined. For example, a unique predefined threshold for all nodes may easily be discovered by intruders. Furthermore, as the nodes of a consensus algorithm converge to the same value, the absolute differences $|x_j(t) - x_i(t)|$ between two nodes $i$ and $j$ converge to zero as $t \to \infty$, rendering the outlier detection potentially insensitive once the absolute differences get smaller than $\lambda$.

Adaptive thresholds have been proposed to address the above issues [23, 24]. They consist, for each node $i$, of computing a local threshold $\lambda_i$ and adapting the threshold at each consensus iteration to the reduction of the absolute differences $|x_j(t) - x_i(t)|$. In [23], the threshold

$$\lambda_i(t+1) = \frac{\sum_{j \in N_i} |x_j(t+1) - x_i(t+1)|}{\sum_{j \in N_i} |x_j(t) - x_i(t)|}\;\lambda_i(t) \qquad (5)$$

(for properly initialized $\lambda_i(0)$) is computed by each node $i$ at each iteration of the consensus phase. The rule in equation (5) computes $\lambda_i$ using the diffusion dynamics of consensus algorithms, so unless the attacker can get multi-hop information access, it cannot foresee the values of its neighbors' thresholds. Consequently, the attacker cannot adapt its consensus loop disruption attack to keep the injected values under the radar of the detection procedure. As the network converges towards consensus, the value $\lambda$ converges toward zero, leading the attacker to eventually be filtered out.

Note that $\lambda_i(t)$ partitions the neighbors of node $i$ into two sets: those neighbors $j$ that have a deviation $|x_j(t) - x_i(t)| \geq \lambda_i(t)$ are considered suspicious; they constitute the neighborhood $N_i^F$ of states that have less weight in the computation of the consensus value $x_i(t+1)$:

$$x_i(t+1) = x_i(t) + \epsilon \sum_{j \in N_i^T} x_j(t) + \frac{\epsilon}{a} \sum_{j \in N_i^F} x_j(t)$$

for some constant $a$. Our outlier detection method computes the threshold $\lambda$ as in equation (5). Those neighbors $j$ that have a deviation $|x_j(t) - x_i(t)| \geq \lambda_i(t)$ are flagged as suspicious. We use a majority rule similar to [24] to convert the status of a neighbor NIDS module $j$ from suspicious to attacker. Let $B$ be the number of common neighbors of module $i$ and module $j$. If more than $\lceil B/2 \rceil$ neighbors of module $i$ report $j$ as suspicious, then module $j$ is considered compromised and is disconnected/removed from the intrusion detection system. Note that we assume a single attacker; if the majority rule identifies more than one neighbor as compromised, the one with the largest deviation is disconnected from the NIDS network.
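The following sketch illustrates the adaptive threshold of equation (5), the suspicion flagging and the majority rule; the data structures mirror the earlier sketches, and the initial threshold value is an assumption since the paper leaves the initialization of lambda_i(0) open.

```python
import math

# Adaptive-threshold outlier flagging (equation (5)), illustrative sketch.
def update_threshold(lam_i, x_new, x_old, i, neighbors_i):
    """lambda_i(t+1) = lambda_i(t) * (summed deviations at t+1) / (summed at t)."""
    num = sum(abs(x_new[j] - x_new[i]) for j in neighbors_i)
    den = sum(abs(x_old[j] - x_old[i]) for j in neighbors_i)
    return lam_i * num / den if den > 0 else lam_i

def suspicious_neighbors(x, lam_i, i, neighbors_i):
    """Neighbors whose deviation from x_i meets or exceeds the local threshold."""
    return [j for j in neighbors_i if abs(x[j] - x[i]) >= lam_i]

def is_compromised(reports_on_j, common_neighbors_B):
    """Majority rule: j is compromised if more than ceil(B/2) common
    neighbors of i and j have reported j as suspicious."""
    return reports_on_j > math.ceil(common_neighbors_B / 2)

lam0 = 5.0   # illustrative lambda_i(0)
```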
3.2.2 Model-based fault detection

Fault detection is a field of control engineering concerned with identifying and locating faulty components in a system. The techniques in this field essentially compare measurements of the actual behavior of a system with its anticipated behavior. In model-based fault detection, the anticipated behavior is described using mathematical models [44]; the measured system variables are compared with their model estimates. Comparisons between the system and the model show deviations when there is a fault in the real system. Such a difference between the system and its model is called a residual or residual vector. There exist several implementations of the model-based approach; the observer-based technique [45] 1) seeks to discriminate between deviations caused by faults in the real process and those caused by the estimations, and 2) provides a residual vector that indicates the faulty system components (the so-called directional residual). Observer-based approaches to cyber-security have been proposed recently in different contexts [46, 47, 48, 49, 50]. We focus more specifically on applications of observer-based fault detection to identify Byzantine attackers in consensus-based algorithms [26, 27].

In order to detect Byzantine attackers during the consensus phase of the NIDS, the design of the consensus loop of each NIDS module is modified to include new matrix-based computations that estimate the consensus state vector $x(t)$; we call this new function of the consensus loop the observer. At each iteration of the consensus loop, the observer computes a state vector $x^o(t)$ estimating $x(t)$, where $x(t)$ is the vector storing the consensus values at iteration $t$ of the consensus loop. We first model the consensus loop disruption attack of equation (4) in matrix form:

$$x(t+1) = W x(t) + I_n u(t) \qquad (6)$$

where $I_n$ is the $n$-dimensional identity matrix, and where $u_i(t) = 0$ whenever NIDS module $i$ behaves normally. The observer requires inputs from the state vector $x(t)$, i.e. the values $x_j(t) \in x(t)$ where $j \in N_i$. These values are stored in a vector $y_i$. The consensus loop of each NIDS module $i$ is now defined as follows:

$$x(t+1) = W x(t) + I_n u(t), \qquad y_i(t) = C_i\,x(t) \qquad (7)$$

where $C_i$ is a $(\deg_i + 1) \times n$ matrix in which entry $C_i[k, l] = 1$ if $l \in N_i$, otherwise $C_i[k, l] = 0$. The vector $y_i(t)$ has $\deg_i + 1$ entries; each entry $j$ of $y_i(t)$ stores the state $x_j(t)$ at time $t$ of a module $j \in N_i$.

Equation (7) represents the consensus loop of a given module $i$ as if it could access all the consensus values at iteration $t$, though in fact module $i$ can only access $x_i(t)$ and $x_j(t)$ for $j \in N_i$. The other entries of vector $x(t)$ are not needed during the computation performed by the recurrence relation of node $i$, so it is not incorrect to model these entries as if they were available. Note that each NIDS module $i$ knows the consensus matrix $W$, the matrix $C_i$, the values $x_j(t) \in x(t)$ for $j \in N_i$, and the identity matrix $I_n$. However, the set of non-zero $u_i$ is unknown to the non-malicious modules. To detect a malicious neighbor of module $i$, the consensus loop of each module computes the following matrix operations [27]:

$$z(t+1) = (W + G C_i)\,z(t) - G\,y_i(t), \qquad x^o(t) = L\,z(t) + K\,y_i(t) \qquad (8)$$

where $z(t)$ is the state of the observer and $x^o(t)$ is the estimation by the observer of module $i$ of the consensus state $x(t)$. The matrices used to compute $z(t+1)$ and $x^o(t)$ are defined as follows: $G = -W_{N_i}$, $K = C_i^T$, $L = I_n - K C_i$, where $W_{N_i}$ denotes the columns of $W$ with indexes in $N_i$. The system in (8) has roots in the observability theory of control theory; a detailed analysis of this system is beyond the scope of this paper, and we refer to [45] for a historical development and analysis of observer-based fault detection systems. The analysis of (8) can be simplified because the consensus system (7) satisfies some conditions [26]. It can be shown that as $t \to \infty$, $x^o(t) \to x(t)$; consequently the estimation error $e(t) = x^o(t) - x(t)$ converges to 0. We are also given that equation (8) under the consensus system in (7) simplifies to:

$$x_j^o(t) = \begin{cases} x_j(t) & \text{if } j = i \text{ or } j \in N_i \\ z_j(t) & \text{otherwise} \end{cases} \qquad (9)$$

and that the state of the observer $z(t+1)$ can be expressed in terms of the consensus matrix [27]:

$$z(t+1) = W x^o(t). \qquad (10)$$

The iteration error $\varepsilon(t) = |x^o(t+1) - W x^o(t)|$ can then be used as a residual vector.
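A compact numerical sketch of the simplified observer of equations (9) and (10), together with the residual; numpy, the function name and the calling convention are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

# Simplified observer iteration for module i (equations (9) and (10)).
# W: the n x n consensus weight matrix, known to every module.
# observed: the indices {i} union N_i whose true states module i receives.
def observer_residual(W, observed, x_est_prev, x_recv_next):
    """One observer step: predict with z(t+1) = W x^o(t) (equation (10)),
    then overwrite the observable entries with the values actually received
    (equation (9)). Returns (residual eps(t), updated estimate x^o(t+1))."""
    z_next = W @ x_est_prev              # prediction from the model
    x_est_next = z_next.copy()
    for j in observed:                   # entries module i can read directly
        x_est_next[j] = x_recv_next[j]
    eps = np.abs(x_est_next - z_next)    # nonzero only where a received value
    return eps, x_est_next               # deviates from the model prediction
```

A neighbor $j$ whose residual entry stays persistently nonzero as $t$ grows is flagged, since the residual tends to the injected input once the estimation error has dissipated, as explained next.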
From (9) and (10), $\varepsilon_j(t) = 0$ for $j \neq i$ and $j \notin N_i$. If $\varepsilon_j(t) \neq 0$, either $x_j^o(t) \neq x_j(t)$ (the estimation error is greater than 0) or $u_j \neq 0$. Since the estimation error dissipates as $t \to \infty$, we have $(x^o(t+1) - W x^o(t)) \to I_n u(t)$ as $t \to \infty$. If $u_j \neq 0$ for some $j \in \{1, \ldots, n\}$ then $(x_j^o(t+1) - (W x^o(t))_j) \to u_j(t)$, and the corresponding module $j$ is detected as compromised.

Together with the consensus loop in (7), the observer defined in (8) provides an algorithm with which each NIDS module can detect whether one of its neighbors sends falsified consensus data. Each module $i$ builds a consensus system and an observer as described in equations (7) and (8). At each consensus iteration, each module $i$ computes $\varepsilon(t) = |x^o(t+1) - W x^o(t)|$. If $\varepsilon_j(t) \neq 0$ then module $j \in N_i$ is compromised. Module $i$ then logically removes module $j$ from its neighborhood by modifying its weight matrix according to the description in Section 2.3, thus stopping the injection of an external input by module $j$ into the network intrusion detection system.

4 Empirical analysis

The above two Byzantine attack detection techniques help the NIDS cope with adversarial environments by detecting compromised modules. In this section we analyze and compare the behavior of each technique. For example, these techniques have a computational cost: we measure the overhead of running each technique. We measure how fast attacks are detected, and the accuracy of the decisions made by the NIDS under each detection method. Last, as the removal of a compromised module is obtained by changing the weight matrix and the network topology, we measure whether these changes have any impact on the convergence speed of the consensus phases, i.e. whether the system returns to its full functioning capabilities after removing compromised modules.

To execute this empirical analysis, the two Byzantine attack identification techniques described in Section 3 have been coded as part of the consensus phase of the NIDS simulations described in [1]. We have run simulations for the following NIDS network topologies: rings with 9 and 25 nodes (NIDS modules), 2-dimensional tori with 9 and 25 nodes, the Petersen graph (10 nodes, 15 edges), and several random graphs having the same number of vertices and edges as the Petersen graph. A simulation consists of executing 1000 iterations for one of the above NIDS network topologies. In one iteration, each NIDS module of the network topology reads the local network traffic from an entry of the NSL-KDD data set, performs a Bayesian analysis of the local traffic, then executes its consensus loop until convergence. Note that we have filtered the attacks in the NSL-KDD data set to retain only denial of service attacks.

The consensus phase is implemented as follows. The Bayesian analysis of the local network traffic by module $i$ returns two values: $p_{A_i}$, the probability that the observed traffic at module $i$ is intrusive, and $p_{N_i}$, the probability that the observed traffic at module $i$ is normal. These values are used to initialize the consensus loop of the corresponding module $i$: $x_i^A(0) = \log(p_{A_i})$ and $x_i^N(0) = \log(p_{N_i})$, for $i = 1..n$. During the consensus phase, for simulations involving the outlier detection technique, each NIDS module $i$ computes the following recurrence relations until $|x_i^A(t+1) - x_i^A(t)| < \epsilon$ and $|x_i^N(t+1) - x_i^N(t)| < \epsilon$:

$$x_i^A(t+1) = W_{ii}\,x_i^A(t) + \sum_{j \in N_i} W_{ij}\,x_j^A(t) + u_i(t) \qquad (11)$$

$$x_i^N(t+1) = W_{ii}\,x_i^N(t) + \sum_{j \in N_i} W_{ij}\,x_j^N(t) + u_i(t). \qquad (12)$$
Similarly, in simulations involving the fault detection technique, each NIDS module $i$ computes the solutions of the following iterative systems until each recurrence of the systems has converged:

$$x^A(t+1) = W x^A(t) + I u^A, \qquad y_i^A(t) = C_i\,x^A(t) \qquad (13)$$

$$x^N(t+1) = W x^N(t) + I u^N, \qquad y_i^N(t) = C_i\,x^N(t). \qquad (14)$$

The matrix operations described in equation (8) are also computed at each iteration of the consensus loop of module $i$ in simulations involving the fault detection technique.

Once a consensus phase is completed, each NIDS module $i$ decides whether or not to raise an alert based on its consensus approximation $\frac{x_i^A(t)}{x_i^N(t)}$ of the actual ratio $\frac{\frac{1}{n}\sum_{i=1}^{n} \log(p_{A_i})}{\frac{1}{n}\sum_{i=1}^{n} \log(p_{N_i})}$ and some predefined alert value ratio. As each module converges asymptotically to the same actual ratio, all modules reach the same decision, which constitutes a form of coordinated response to perceived anomalies in the network traffic.

The consensus loop disruption of the attack model 2(b) in Section 3.1 is implemented as follows. Anomaly-based intrusion detection systems tend to have high false positive rates. We simulate attacks that aim to further increase the number of false positives. Attacks inject positive values into the consensus loop component (11) or (13). At each iteration of a simulation, a to-be-compromised NIDS module $j$ is selected randomly, and $u_j^A$ is then assigned a positive value. The magnitude of $u_j^A$ has to be large enough to falsify the decision at the end of the consensus phase (i.e. raise an alert when traffic is normal) if the consensus loop disruption attack is not detected. For example, $u_j^A = 0.0005$ is too small: it does not have an impact on the decision. However, a value such as $u_j^A = 0.5$ can cause each module of the system to converge to an approximation $\frac{x_i^A(t)}{x_i^N(t)} > \frac{\frac{1}{n}\sum_{i=1}^{n} \log(p_{A_i})}{\frac{1}{n}\sum_{i=1}^{n} \log(p_{N_i})}$, thus possibly leading the NIDS to raise an alert when in fact there is no attack. The value $u_j^A = 0.5$ is also suitable for obtaining meaningful test results for the following reason. The values $p_{A_i}$ and $p_{N_i}$ returned by the Bayesian analysis of a module $i$ are products of likelihoods $\prod_{j=1}^{m} P(o_j \mid h)$; as the number of features is large, these products are very small. During the consensus phase, neighbor NIDS modules exchange log-likelihoods, which are in the range between -20 and -55. So $u_j^A = 0.5$ is a relatively small external input during the consensus phase. It is, however, large enough that our two detection techniques can always detect this attack, while failing to detect it soon enough can lead the consensus phase to converge to values quite different from $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$.

In the following sections, we first evaluate the computational cost of running each of the two detection techniques. We also report the number of consensus iterations needed to detect a compromised module. Next we analyze the efficiency of the detection techniques at preventing the occurrence of false positives at the conclusion of a consensus phase. Finally we analyze the impact of our technique for removing a compromised NIDS module on the convergence speed of consensus phases.
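To make the decision rule concrete, here is a minimal sketch of the per-module alert decision taken after the consensus phase; the alert ratio value and the comparison direction follow the description above (an injected positive $u_j^A$ pushes the approximation above the actual ratio and produces false alerts), but the concrete threshold is an illustrative assumption.

```python
# Per-module alert decision after the consensus phase (illustrative).
ALERT_RATIO = 0.6   # hypothetical predefined alert value ratio

def raise_alert(x_a, x_n, alert_ratio=ALERT_RATIO):
    """x_a, x_n: converged consensus values approximating the averaged
    log-likelihoods of anomalous and normal traffic. An approximation
    ratio exceeding the predefined alert value ratio triggers the alert;
    since all modules converge to the same ratio, the response is
    coordinated across the NIDS."""
    return (x_a / x_n) > alert_ratio
```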
4.1 Computational costs

Table 1 reports the computational cost of running each detection technique. All the simulations are executed while no attack takes place; these tests measure only the overhead of running the code implementing the two detection techniques. The column "Cost" reports the time in milliseconds for running the NIDS network simulation for 1000 iterations. In Table 1, the "no detection" rows give the cost of running an NIDS simulation without the execution of any detection code. The "outlier" and "fault" rows give the cost of running the NIDS modules while also executing, respectively, the code for the outlier method and the fault method. The higher cost of the detection techniques compared to "no detection" for the same network size and topology reflects the cost of protecting the consensus-based NIDS with the corresponding detection technique. Table 1 shows that the computational overhead of outlier detection is clearly smaller than that of the fault detection method. These results were expected: each consensus loop iteration of the fault detection method runs several matrix operations, compared to simple scalar operations for the outlier method.

Table 1: Consensus-based NIDS computational simulation costs in milliseconds.

Topology   Size  Detection     Cost
Ring       9     no detection  0.050
                 outlier       0.276
                 fault         0.921
           25    no detection  0.101
                 outlier       1.131
                 fault         3.286
Torus      9     no detection  0.027
                 outlier       0.121
                 fault         1.327
           25    no detection  0.043
                 outlier       0.567
                 fault         6.055
Petersen   10    no detection  0.005
                 outlier       0.135
                 fault         0.597
Random     10    no detection  0.013
                 outlier       0.290
                 fault         1.268

4.2 Detection speed

Figures 3 to 8 detail the rapidity (detection speed) with which the two detection techniques identify compromised modules. Each figure corresponds to a different network topology. The values on the x axis are the number of consensus iterations needed before the compromised module is identified. The y axis displays the percentage of the 1000 consensus phases that needed a given number of consensus iterations to detect a compromised module. These figures clearly show that the fault detection approach needs fewer iterations to detect Byzantine attacks. Combining this with the computational cost in Table 1, we observe that the outlier method has a more favorable computational overhead but requires more iterations to detect compromised modules.

Figure 3: Detection speed of ring topology 9 nodes.
Figure 4: Detection speed of ring topology 25 nodes.
Figure 5: Detection speed of torus topology 9 nodes.
Figure 6: Detection speed of torus topology 25 nodes.
Figure 7: Detection speed of Petersen graph.
Figure 8: Detection speed of random graphs.

4.3 Intrusion detection accuracy

Disruption of the consensus loops by injecting external inputs has an impact on the accuracy of the decision made by the NIDS about the state of the network traffic. Table 2 measures how effective the two detection techniques are at maintaining the accuracy of the consensus-based NIDS. The "no attack" rows report the accuracy of the NIDS in a non-adversarial environment. The "no detection" rows report the accuracy of the NIDS when attacks take place while the NIDS is not protected. The "outlier" and "fault" rows report, respectively, the accuracy of the NIDS protected by the outlier and fault detection methods.

The results of Table 2 are obtained without changing the weight matrix and network topology once a compromised NIDS module is identified (static consensus). Let $l$ be the iteration of the consensus phase at which module $i$ identifies a neighbor module $j$ as compromised.
For $t > l$, module $i$ applies the following update rule:

$$x_i^A(t+1) = W_{ii}\,x_i^A(t) + \sum_{k \in N_i,\, k \neq j} W_{ik}\,x_k^A(t) + W_{ij}\,x_j^A(t) - 0.5.$$

This update is possible since $u_j^A = 0.5$ is known in the context of our simulations, though it is not known which module is compromised in this way. Module $i$ removes 0.5 from the value sent by the compromised module $j$; therefore module $i$ updates its state with the true consensus values of its neighbors.

Table 2 shows the outlier detection method to be less accurate than the fault detection method, even though Byzantine attacks are always detected and the compromised NIDS module neutralized. These results are explained by the number of consensus loop iterations needed to detect attackers. Figures 3 to 8 show the outlier method needing more iterations to detect compromised modules. The more iterations it takes to detect a compromised module, the more data injections take place prior to its detection and the more time the injected values have to diffuse across the NIDS modules, which causes the NIDS decision to tilt the wrong way more frequently in the case of the outlier method.

Table 2: Accuracy of the NIDS.

Topology   Size  Detection     TP   TN   FP   FN
Ring       9     no attack     466  520  14   0
                 no detection  521  0    479  0
                 outlier       522  404  74   0
                 fault         456  525  4    15
           25    no attack     527  473  0    0
                 no detection  475  0    525  0
                 outlier       497  503  58   0
                 fault         506  489  0    5
Torus      9     no attack     493  491  16   0
                 no detection  499  0    501  0
                 outlier       495  438  67   0
                 fault         478  511  0    11
           25    no attack     492  507  1    0
                 no detection  497  0    503  0
                 outlier       491  456  53   0
                 fault         518  450  32   0
Petersen   10    no attack     501  487  12   0
                 no detection  481  0    519  0
                 outlier       477  458  65   0
                 fault         481  516  0    3
Random     10    no attack     451  533  16   0
                 no detection  485  0    515  0
                 outlier       503  432  65   0
                 fault         526  464  0    10

4.4 Convergence speed

This section analyzes the impact of removing a module while the NIDS computes consensus states. According to Section 2.3, the technique we propose to isolate a compromised module satisfies the average consensus convergence conditions: rows of the weight matrices sum to 1, and each NIDS network topology tested in this empirical analysis section is such that it is still connected even after one module is removed. However, as our approach changes the weight matrix and the NIDS network topology, two factors that could impact the convergence speed, we still need to analyze the consensus phase convergence speed when modules are removed. In this section we compare the consensus phase convergence speed of the static consensus implementation of Section 4.3, running the Metropolis-Hastings weight matrix, with the convergence speed when the consensus phase is implemented with the dynamic consensus procedure introduced in Section 2.3.

Figures 9 to 13 compare the convergence speed of static versus dynamic consensus for the outlier detection method, while figures 14 to 18 compare the convergence speed of static versus dynamic consensus for the fault detection method. As we can see from figures 14 to 18, there are no significant differences in the convergence speed of static and dynamic consensus for the fault-based detection method, except for the Petersen graph. With the outlier detection method, as shown in figures 9 to 13, dynamic consensus converges faster for some of the network topologies. It is not entirely clear why the convergence speed is better with dynamic consensus in some specific outlier simulations.
Nonetheless, figures 9 to 18 show no significant decrease in the convergence speed of consensus phases once a module has been isolated. As accuracy is not impacted by the removal of a module, this is enough to conclude that the intrusion detection system returns to a fully functioning state.

Figure 9: Convergence: dynamic topology, outlier detection, ring topology 9 nodes.
Figure 10: Convergence: dynamic topology, outlier detection, ring topology 25 nodes.
Figure 11: Convergence: dynamic topology, outlier detection, torus topology 9 nodes.
Figure 12: Convergence: dynamic topology, outlier detection, torus topology 25 nodes.
Figure 13: Convergence: dynamic topology, outlier detection, Petersen graph.
Figure 14: Convergence: dynamic topology, fault detection, ring topology 9 nodes.
Figure 15: Convergence: dynamic topology, fault detection, ring topology 25 nodes.
Figure 16: Convergence: dynamic topology, fault detection, torus topology 9 nodes.
Figure 17: Convergence: dynamic topology, fault detection, torus topology 25 nodes.
Figure 18: Convergence: dynamic topology, fault detection, Petersen graph.

5 Conclusion

Local data exchanges of consensus-based distributed applications can be hacked by Byzantine attackers falsifying computed consensus information. Several solutions have been proposed in the literature to address Byzantine attacks on consensus algorithms. We have adapted two of these solutions, one from model-based fault detection and one from outlier detection, to protect a consensus-based network intrusion detection system. We have also applied results from dynamic consensus theory to derive a simple approach to isolate compromised modules from the network while continuing to satisfy the mathematical assumptions required for convergence of the consensus phase.

Our results show that the two methods we propose can be used to detect consensus loop disruptions and prevent falsifications of NIDS network traffic assessments. Though preliminary, our results also show significant computational costs for these approaches, either in terms of the number of iterations to detect attacks (outlier detection) or in terms of the computational cost of each iteration (model-based detection). This might raise issues for deploying consensus-based NIDSs in suitable environments such as wireless ad hoc networks.

Future work will address both protecting the consensus-based NIDS against disruptive attacks and getting the system closer to deployment in wireless network environments. We will work on reducing the computational cost, for example by speeding up the consensus phase, i.e. reducing the number of iterations needed for the modules to come to agreed decisions. This will impact Byzantine fault detection, which will need to be done at earlier stages and at a smaller computational cost. Byzantine fault detection will be broadened to other attack models, involving more than one compromised module and possibly colluding attackers. Addressing multiple attackers seems achievable without too much research effort using outlier or reputation-based methods. On the other hand, current model-based approaches in control theory seem too computationally demanding and will need more research before they can be used in a deployed system. Finally, we intend to broaden the cooperation among NIDS modules.
This depends on consensus computing more functions of the initial values provided by the analysis phase. There is a wide range of functions that can be computed using distributed iterative methods similar to average consensus. This will bring more versatility in detecting network intrusions and allow for a wide range of coordinated responses to address detected malicious network activities.

References

[1] M. Toulouse, B. Q. Minh, and P. Curtis, "A consensus based network intrusion detection system," in IT Convergence and Security (ICITCS), 2015 5th International Conference on. IEEE, 2015, pp. 1–6. [Online]. Available: http://dblp.uni-trier.de/db/conf/icitcs/icitcs2015.html#ToulouseMC15

[2] A. Patel, M. Taghavi, K. Bakhtiyari, and J. Celestino Júnior, "Review: An intrusion detection and prevention system in cloud computing: A systematic review," J. Netw. Comput. Appl., vol. 36, no. 1, pp. 25–41, Jan. 2013. [Online]. Available: http://dx.doi.org/10.1016/j.jnca.2012.08.007

[3] C. V. Zhou, C. Leckie, and S. Karunasekera, "A survey of coordinated attacks and collaborative intrusion detection," Computers & Security, vol. 29, no. 1, pp. 124–140, 2010. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S016740480900073X

[4] E. Vasilomanolakis, S. Karuppayah, M. Mühlhäuser, and M. Fischer, "Taxonomy and survey of collaborative intrusion detection," ACM Comput. Surv., vol. 47, no. 4, pp. 55:1–55:33, May 2015. [Online]. Available: http://doi.acm.org/10.1145/2716260

[5] S. R. Snapp, J. Brentano, G. V. Dias, T. L. Goan, L. T. Heberlein, C. L. Ho, K. N. Levitt, B. Mukherjee, S. E. Smaha, T. Grance et al., "DIDS (distributed intrusion detection system) - motivation, architecture, and an early prototype," in Proceedings of the 14th National Computer Security Conference, vol. 1. Citeseer, 1991, pp. 167–176.

[6] T. Bass, "Multisensor data fusion for next generation distributed intrusion detection systems," in Proceedings of the IRIS National Symposium on Sensor and Data Fusion, 1999, pp. 24–27.

[7] R. Janakiraman, M. Waldvogel, and Q. Zhang, "Indra: A peer-to-peer approach to network intrusion detection and prevention," in Enabling Technologies: Infrastructure for Collaborative Enterprises, 2003. WET ICE 2003. Proceedings. Twelfth IEEE International Workshops on. IEEE, 2003, pp. 226–231.

[8] C. V. Zhou, S. Karunasekera, and C. Leckie, "A peer-to-peer collaborative intrusion detection system," in 2005 13th IEEE International Conference on Networks, jointly held with the 2005 IEEE 7th Malaysia International Conference on Communications, vol. 1, Nov 2005, pp. 118–123.

[9] M. Locasto, J. J. Parekh, A. D. Keromytis, and S. J. Stolfo, "Towards collaborative security and p2p intrusion detection," in Proceedings of the IEEE Information Assurance Workshop (IAW), 2005, pp. 333–339.

[10] M. Marchetti, M. Messori, and M. Colajanni, Peer-to-Peer Architecture for Collaborative Intrusion and Malware Detection on a Large Scale. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 475–490. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-04474-8_37

[11] N. A. Lynch, Distributed Algorithms. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1996.

[12] T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen, and O. Shochet, "Novel type of phase transition in a system of self-driven particles," Phys. Rev. Lett., vol. 75, pp. 1226–1229, Aug 1995. [Online]. Available: http://link.aps.org/doi/10.1103/PhysRevLett.75.1226
[13] R. Olfati-Saber and R. M. Murray, “Consensus protocols for networks of dynamic agents,” in American Control Conference, 2003. Proceedings of the 2003, vol. 2, June 2003, pp. 951–956.

[14] A. Fagiolini, M. Pellinacci, M. Valenti, G. Dini, and A. Bicchi, “Consensus-based distributed intrusion detection for multi-robot systems,” in Proc. IEEE Int. Conf. on Robotics and Automation, 2008, pp. 120–127.

[15] J. Tsitsiklis, D. Bertsekas, and M. Athans, “Distributed asynchronous deterministic and stochastic gradient optimization algorithms,” Automatic Control, IEEE Transactions on, vol. 31, no. 9, pp. 803–812, Sep. 1986.

[16] S. Li, G. Oikonomou, T. Tryfonas, T. Chen, and L. Xu, “A distributed consensus algorithm for decision-making in service-oriented internet of things,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1461–1468, 2014. [Online]. Available: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6740862

[17] A. Narayanan, J. Bonneau, E. Felten, A. Miller, and S. Goldfeder, Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction. Princeton, NJ, USA: Princeton University Press, 2016.

[18] I. F. Akyildiz, B. F. Lo, and R. Balakrishnan, “Cooperative spectrum sensing in cognitive radio networks: A survey,” Phys. Commun., vol. 4, no. 1, pp. 40–62, Mar. 2011. [Online]. Available: http://dx.doi.org/10.1016/j.phycom.2010.12.003

[19] G. Xiong and S. Kishore, “Consensus-based distributed detection algorithm in wireless ad hoc networks,” in Signal Processing and Communication Systems, 2009. ICSPCS 2009. 3rd International Conference on, Sept 2009, pp. 1–6.

[20] K. Avrachenkov, M. E. Chamie, and G. Neglia, “A local average consensus algorithm for wireless sensor networks,” in 2011 International Conference on Distributed Computing in Sensor Systems and Workshops (DCOSS), June 2011, pp. 1–6.

[21] M. Pease, R. Shostak, and L. Lamport, “Reaching agreement in the presence of faults,” J. ACM, vol. 27, no. 2, pp. 228–234, Apr. 1980. [Online]. Available: http://doi.acm.org/10.1145/322186.322188

[22] L. Lamport, R. Shostak, and M. Pease, “The Byzantine generals problem,” ACM Trans. Program. Lang. Syst., vol. 4, no. 3, pp. 382–401, Jul. 1982. [Online]. Available: http://doi.acm.org/10.1145/357172.357176

[23] S. Liu, H. Zhu, S. Li, X. Li, C. Chen, and X. Guan, “An adaptive deviation-tolerant secure scheme for distributed cooperative spectrum sensing,” in 2012 IEEE Global Communications Conference, GLOBECOM 2012, Anaheim, CA, USA, December 3-7, 2012, 2012, pp. 603–608. [Online]. Available: http://dx.doi.org/10.1109/GLOCOM.2012.6503179

[24] Q. Yan, M. Li, T. Jiang, W. Lou, and Y. T. Hou, “Vulnerability and protection for distributed consensus-based spectrum sensing in cognitive radio networks,” in INFOCOM, 2012 Proceedings IEEE. IEEE, 2012, pp. 900–908.

[25] W. Zeng and M.-Y. Chow, “A reputation-based secure distributed control methodology in D-NCS,” IEEE Trans. Industrial Electronics, vol. 61, no. 11, pp. 6294–6303, 2014. [Online]. Available: http://dblp.uni-trier.de/db/journals/tie/tie61.html#ZengC14

[26] F. Pasqualetti, A. Bicchi, and F. Bullo, “Consensus computation in unreliable networks: A system theoretic approach,” IEEE Transactions on Automatic Control, vol. 57, no. 1, pp. 90–104, Jan. 2012.

[27] F. Pasqualetti, A. Bicchi, and F. Bullo, “Distributed intrusion detection for secure consensus computations,” in Decision and Control, 2007 46th IEEE Conference on, Dec 2007, pp. 5594–5599.

[28] S. Sundaram and C. N. Hadjicostis,
“Distributed function calculation via linear iterative strategies in the presence of malicious agents,” IEEE Transactions on Automatic Control, vol. 56, no. 7, pp. 1495–1508, July 2011.

[29] L. Xiao, S. Boyd, and S.-J. Kim, “Distributed average consensus with least-mean-square deviation,” Journal of Parallel and Distributed Computing, vol. 67, no. 1, pp. 33–46, 2007. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0743731506001808

[30] M. Zhu and S. Martínez, “Discrete-time dynamic average consensus,” Automatica, vol. 46, no. 2, pp. 322–329, 2010. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0005109809004828

[31] R. Olfati-Saber and R. M. Murray, “Consensus problems in networks of agents with switching topology and time-delays,” Automatic Control, IEEE Transactions on, vol. 49, no. 9, pp. 1520–1533, Sep. 2004. [Online]. Available: http://dx.doi.org/10.1109/tac.2004.834113

[32] L. Xiao, S. Boyd, and S. Lall, “Distributed average consensus with time-varying metropolis weights,” 2006, unpublished. [Online]. Available: http://web.stanford.edu/~boyd/papers/pdf/avg_metropolis.pdf

[33] L. Xiao, S. Boyd, and S. Lall, “A space-time diffusion scheme for peer-to-peer least-squares estimation,” in Proceedings of the Fifth International Conference on Information Processing in Sensor Networks, IPSN 2006, Nashville, Tennessee, USA, April 19-21, 2006, 2006, pp. 168–176. [Online]. Available: http://doi.acm.org/10.1145/1127777.1127806

[34] L. Xiao, S. Boyd, and S. Lall, “A scheme for robust distributed sensor fusion based on average consensus,” in Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, ser. IPSN '05. Piscataway, NJ, USA: IEEE Press, 2005. [Online]. Available: http://dl.acm.org/citation.cfm?id=1147685.1147698

[35] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in Proceedings of the Second IEEE International Conference on Computational Intelligence for Security and Defense Applications, ser. CISDA'09. Piscataway, NJ, USA: IEEE Press, 2009, pp. 53–58. [Online]. Available: http://dl.acm.org/citation.cfm?id=1736481.1736489

[36] L. Xiao, S. Boyd, and S.-J. Kim, “Distributed average consensus with least-mean-square deviation,” J. Parallel Distrib. Comput., vol. 67, no. 1, pp. 33–46, Jan. 2007. [Online]. Available: http://dx.doi.org/10.1016/j.jpdc.2006.08.010

[37] A. Olshevsky and J. N. Tsitsiklis, “Convergence speed in distributed consensus and averaging,” SIAM J. Control Optim., vol. 48, no. 1, pp. 33–55, Feb. 2009. [Online]. Available: http://dx.doi.org/10.1137/060678324

[38] L. Xiao and S. Boyd, “Fast linear iterations for distributed averaging,” Systems and Control Letters, vol. 53, pp. 65–78, 2003.

[39] S. Kar and J. M. F. Moura, “Topology for global average consensus,” October 2006, pp. 276–280.

[40] B. Kailkhura, S. Brahma, and P. K. Varshney, “Data falsification attacks on consensus-based detection systems,” IEEE Transactions on Signal and Information Processing over Networks, vol. 3, no. 1, pp. 145–158, March 2017.

[41] W. Ben-Ameur, P. Bianchi, and J. Jakubowicz, “Robust average consensus using total variation gossip algorithm,” in 6th International ICST Conference on Performance Evaluation Methodologies and Tools, Cargese, Corsica, France, October 9-12, 2012, 2012, pp. 99–106. [Online].
Available: http://dx.doi.org/10.4108/valuetools.2012.250316

[42] S. Mi, H. Han, C. Chen, J. Yan, and X. Guan, “A secure scheme for distributed consensus estimation against data falsification in heterogeneous wireless sensor networks,” Sensors, vol. 16, no. 2, p. 252, 2016. [Online]. Available: http://www.mdpi.com/1424-8220/16/2/252

[43] V. P. Illiano and E. C. Lupu, “Detecting malicious data injections in wireless sensor networks: A survey,” ACM Comput. Surv., vol. 48, no. 2, pp. 24:1–24:33, Oct. 2015. [Online]. Available: http://doi.acm.org/10.1145/2818184

[44] R. Isermann, “Model-based fault-detection and diagnosis - status and applications,” Annual Reviews in Control, vol. 29, pp. 71–85, 2005.

[45] J. Chen, J. R. Patton, and H.-Y. Zhang, “Design of unknown input observers and robust fault detection filters,” International Journal of Control, vol. 63, no. 1, pp. 85–105, 1996.

[46] Z. A. Biron, P. Pisu, and B. HomChaudhuri, “Observer design based cyber security for cyber physical systems,” in Proceedings of the 10th Annual Cyber and Information Security Research Conference, ser. CISR '15. New York, NY, USA: ACM, 2015, pp. 6:1–6:6. [Online]. Available: http://doi.acm.org/10.1145/2746266.2746272

[47] D. Ding, Z. Wang, D. W. C. Ho, and G. Wei, “Observer-based event-triggering consensus control for multiagent systems with lossy sensors and cyber-attacks,” IEEE Transactions on Cybernetics, vol. PP, no. 99, pp. 1–12, 2016.

[48] F. Pasqualetti, R. Carli, A. Bicchi, and F. Bullo, “Identifying cyber attacks via local model information,” in International Conference on Decision and Control - CDC 2010, Atlanta, USA, December 2010, pp. 5961–5966.

[49] A. Teixeira, H. Sandberg, and K. H. Johansson, “Networked control systems under cyber attacks with applications to power networks,” in Proceedings of the 2010 American Control Conference, June 2010, pp. 3690–3696.

[50] L. Negash, S. Kim, and H. Choi, “Distributed unknown-input-observers for cyber attack detection and isolation in formation flying UAVs,” CoRR, vol. abs/1701.06325, 2017. [Online]. Available: http://arxiv.org/abs/1701.06325

Individual Classification: an Ontological Fuzzy Based Approach

Asma Djellal
Preparatory School of Economics, Business and Management Sciences of Constantine, Algeria
LIRE Laboratory, Constantine 2 - Abdelhamid Mehri University, Constantine, Algeria
E-mail: asmadjellal@gmail.com

Zizette Boufaida
LIRE Laboratory, Constantine 2 - Abdelhamid Mehri University, Constantine, Algeria
E-mail: zizette.boufaida@univ-constantine2.dz

Keywords: fuzzy logic, fuzzy ontology, classification reasoning, individual classification, fuzzy ontologies realization

Received: July 25, 2016

Recently, several reasoners for very expressive fuzzy Description Logics have been implemented. However, in some cases, applications do not require all the reasoner services and would benefit from the efficiency of just certain reasoning tasks. To this end, we are interested in the individual fuzzy classification issue. In fact, decision-making applications for real-world domains are often based on classifying new situations into fuzzy categories. Therefore, we propose Fuzzy Realizer to offer an effective classification even with imprecise/vague or incomplete knowledge, so that appropriate decisions can be made. Fuzzy Realizer is a Java prototype implementation for realizing fuzzy ontologies. It supports the well-known fuzzy description logic Z SHOIN (D).
It allows (i) fuzzy concrete domains, (ii) modified concepts and (iii) weighted concepts. It is able to (i) classify new individuals, even with incomplete descriptions, (ii) provide a more human-oriented classification by hiding the crisp boundaries between different fuzzy categories and (iii) populate fuzzy ontologies, which addresses an aspect of fuzzy ontology evolution, a topic that is rarely discussed.

Povzetek: Razvit je postopek za individualno klasifikacijo s pomočjo mehke logike.

1 Introduction

Crisp ontologies, based on first-order logic formalisms, are not suitable for handling imperfect knowledge. Knowledge imperfection, manifested by incomplete, vague or imprecise notions, is inherent to several real-world domains, and this problem has therefore attracted the attention of many research communities [21, 22, 26, 28, 29]. Several approaches have incorporated fuzzy logic into ontology languages and description logics (DLs) to build so-called fuzzy ontologies. Indeed, a number of reasoners for very expressive fuzzy DLs have been implemented [31], including FiRE [25], FuzzyDL [3, 6] and DeLorean [2]. Moreover, a number of optimization techniques have been proposed recently for improving reasoning efficiency for very expressive fuzzy DLs [5, 24]. However, in some cases, applications do not require all the reasoner services and would benefit from the efficiency of just certain reasoning tasks. To this end, we have been interested in the fuzzy ontology realization issue.

Realizing fuzzy ontologies with new individuals is a very important reasoning task. Using this reasoning task, several real-world domains can benefit from effective decision-making applications. Indeed, in a domain like e-health, doctors always classify their patients into fuzzy categories. When referring to a patient's fever, for example, if we have a body temperature of 38.5°, it will be stated that the patient has a "high" fever. However, a temperature of 38° will also present a "high" fever, but it can equally be stated that it is an "average" fever. A similar classification can be used in industry, where Industrial Process Control Systems collect data, such as the temperature and pressure of gas and oil pipes, to be classified as safe situations or not. Based on this classification, appropriate decisions can be made.

Classification is the main reasoning mechanism for systems based on class/instance models. It is one of the most powerful and fundamental human inference mechanisms. It maintains the stability of the knowledge base in the presence of new knowledge by connecting each piece of knowledge to its class. However, since we are handling imperfect knowledge, giving exact definitions of class boundaries seems to be a very difficult, perhaps even impossible, task. Therefore, we have integrated fuzzy logic with classification to enable the attachment of an individual to several fuzzy classes. Such attachment makes the sharp borders between classes disappear, which better reflects reality and allows a more human-oriented modelling process.

Having these ideas in mind, we propose a fuzzy-based approach for realizing fuzzy ontologies by classifying new individuals and connecting them to their most specialized concepts. Based on this classification, operators may make appropriate decisions. With our approach, two features of knowledge imperfection can be handled: vagueness/imprecision and incompleteness.
Indeed, based on a fuzzy classification algorithm, the proposed reasoning service can classify new individuals, even with incomplete descriptions. To validate our ideas, we have implemented this algorithm in what we call Fuzzy Realizer. It is a Java prototype implementation supporting the fuzzy DL SHOIN (D) under Zadeh semantics (Z SHOIN (D)). It allows (i) fuzzy concrete domains, (ii) modified concepts and (iii) weighted concepts. The key features of Fuzzy Realizer are that (i) it can classify new individuals even though we may lack information about them, (ii) it provides a more human-oriented classification process by assigning an individual to several fuzzy concepts with different membership degrees and, finally, (iii) it can populate fuzzy ontologies, which addresses an aspect of fuzzy ontology evolution, a topic that is rarely discussed. Indeed, ever since the development of ontologies, especially from large text corpora, became a well-understood problem [23], reconstruction has always been preferred to an evolutionary process. In fact, the evolution problem is challenging [33] and needs to be analysed from different points of view; thus, the present paper addresses the individual classification issue by providing a realization service for fuzzy ontologies.

The remainder of this paper is organized as follows. Section 2 presents some preliminaries that will be used in the rest of the paper, namely fuzzy logic and the classification reasoning mechanism. Section 3 reviews some related works and situates our work in that context. Section 4 discusses the proposed fuzzy realization algorithm; then an extension of this approach, namely a fuzzy relocation process, is presented in Section 5. To validate our ideas, we present Fuzzy Realizer in Section 6. Finally, Section 7 concludes the paper with ideas for future research.

2 Preliminaries

This section describes some background material regarding (i) fuzzy logic and its use for representing imperfect knowledge, and (ii) the classification reasoning mechanism.

2.1 Fuzzy logic and fuzzy ontology

Fuzzy logic was designed to solve the problem of vague/fuzzy and imprecise knowledge representation. It was introduced by L. A. Zadeh in the mid-1960s as an extension of Boolean logic [34]. In classical set theory, there are two possibilities: elements either belong to a set or they do not. This theory does not consider many situations that are frequently encountered in everyday life, where imprecision is manifested by terms like high, young, hot and the like. Fuzzy logic, based on fuzzy set theory, is designed to consider this kind of situation. It is based on the notion of partial membership, where each element belongs partially or gradually to defined fuzzy subsets.

Definition. Let X be a set of elements. A fuzzy subset A of X is defined by a function called the membership function, denoted μA. It is a mapping which takes any value from the real interval [0, 1]:

μA : X → [0, 1], x ↦ μA(x)

The crisp set operators negation, intersection and union are extended to fuzzy subsets and performed by fuzzy negation, t-norm and s-norm functions, respectively, so that one can form different fuzzy logics. The most widely used one is Zadeh fuzzy logic, known as Zadeh Semantics [4]. It is the combination of the Gödel conjunction tG(a, b) = min(a, b), the Gödel disjunction sG(a, b) = max(a, b) and the Łukasiewicz negation NL(a) = 1 − a.
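As a concrete rendering of these connectives, the following minimal Java sketch (our own illustration; the class and method names are not taken from any system cited here) evaluates the Zadeh operators on two membership degrees.

// Zadeh semantics: Goedel conjunction/disjunction and Lukasiewicz negation.
public class ZadehSemantics {
    static double and(double a, double b) { return Math.min(a, b); } // t-norm tG
    static double or(double a, double b)  { return Math.max(a, b); } // s-norm sG
    static double not(double a)           { return 1.0 - a; }        // negation NL

    public static void main(String[] args) {
        double high = 0.7, average = 0.4;       // example membership degrees
        System.out.println(and(high, average)); // 0.4
        System.out.println(or(high, average));  // 0.7
        System.out.println(not(high));          // ~0.3
    }
}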
Fuzzy calculus is a vast and very flexible research field; indeed, it is used in many domains, one of them being fuzzy ontology development [5, 1, 14]. Fuzzy ontologies extend crisp ones by interpreting concepts and roles as fuzzy sets of individuals and fuzzy binary relations, respectively. Unlike crisp ontologies, which allow an element to be described or not, {0, 1}, by each concept in the ontology, fuzzy ontologies associate an element with each concept using a membership degree in the interval [0, 1]. Such association allows the attachment of each element to different concepts with different membership degrees. Consequently, fuzzy ontologies have a more flexible representation capability than crisp ones. In fact, vague notions, manifested by fuzzy terms like high_temperature, very_close_to and the like, are quite common in human language, and they can be represented by means of fuzzy ontology elements using different constructs [29]; the most important of these are:

Explicit fuzzy concepts. Represented by means of fuzzy membership functions using fuzzy concrete domains, such as High_temperature, which is a fuzzy concept defined with the fuzzy concrete domain High with its right-shoulder membership function High(37, 38.5) as:

High_temperature ≡ temperature ⊓ ∃Degree.High

Modified concepts. Fuzzy modifiers, such as very or slightly, are defined by functions fm : [0, 1] → [0, 1] applied to change membership functions. For instance, Very_high_temperature is a fuzzy modified concept defined with Very as a fuzzy modifier having the function fVery(x) = x² as:

Very_high_temperature ≡ temperature ⊓ ∃Degree.Very(High)

Weighted concepts. Sometimes we want to express the importance of concepts, representing preferences or priorities, such as 0.8 (C). These concepts, called fuzzy weighted concepts, are defined as follows:

D ≡ w (C), w ∈ [0, 1]

For the rest of the paper, m and fm are used to represent fuzzy modifiers and their membership functions, while w (w ∈ [0, 1]) is used to express weights of concepts. In this section, we have provided some preliminaries regarding fuzzy ontologies by introducing the basic concepts which are involved. For a more in-depth presentation, we refer the reader to [30].
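A small Java sketch (again our own illustration, using the High(37, 38.5) right-shoulder function and the modifier fVery(x) = x² defined above; the parametrization of the right-shoulder function as rising linearly between its two arguments is a standard convention we assume here) shows how the degrees of these three constructs would be computed for a body temperature of 38°.

// Right-shoulder membership function: 0 below a, 1 above b, linear in between.
public class FuzzyTemperature {
    static double rightShoulder(double x, double a, double b) {
        if (x <= a) return 0.0;
        if (x >= b) return 1.0;
        return (x - a) / (b - a);
    }
    static double very(double d)             { return d * d; } // modifier fVery(x) = x^2
    static double weight(double w, double d) { return w * d; } // weighted concept w(C)

    public static void main(String[] args) {
        double t = 38.0;
        double high = rightShoulder(t, 37.0, 38.5);                    // High(37, 38.5) -> ~0.67
        System.out.println("High_temperature:      " + high);
        System.out.println("Very_high_temperature: " + very(high));        // ~0.44
        System.out.println("0.8(High_temperature): " + weight(0.8, high)); // ~0.53
    }
}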
2.2 Classification reasoning mechanism

Classification is the fundamental inference mechanism for object-based representations. Indeed, structuring knowledge into classes, subclasses and instances promotes the use of classification to retrieve implicit knowledge. To this end, classification can be used to (i) categorize a set of objects into category graphs, (ii) add a new category to an already created graph or (iii) add a new object to its most specialized categories in the created graph [18]. This process, also called individual classification, refers to ontology realization. It is used to retain the stability of an already created knowledge base in the presence of a new individual by connecting it to the most specialized concepts it belongs to (see Figure 1). Classification of individuals consists of precisely selecting their belonging classes. Therefore, different classes have to be well separated. However, giving exact definitions of class boundaries is a very difficult, perhaps even impossible, task. The difficulty comes from the vagueness of the modelled knowledge.

To address this problem, we have integrated fuzzy logic with classification to enable the use of non-numerical values, which allow non-sharp definitions of class boundaries. Fuzzy classification [16, 32] is the process of grouping elements into fuzzy sets. The membership of these elements in each fuzzy set is not full but partial to some degree. The main difference between crisp and fuzzy classification is that in fuzzy classification, an element can belong to several fuzzy classes with different membership degrees. Such membership makes the sharp borders between classes disappear, which better reflects reality and allows a more human-oriented modelling process.

Figure 1: An individual classification example.

3 Related work

Work related to our research context explores two research fields: (i) handling imperfect knowledge and (ii) classification reasoning mechanisms.

3.1 Handling imperfect knowledge

It has been widely pointed out that crisp ontologies are not suitable for handling imperfect knowledge. Thus, many fuzzy approaches have been proposed to overcome this limitation [1, 3, 4, 7, 13, 27]. As a result, a few methodologies for developing fuzzy ontologies have been proposed and a number of fuzzy extensions of DLs have been used. However, like crisp ontologies, the success of fuzzy ones depends on the availability of effective software allowing their exploitation. Consequently, the reasoning task has been a very interesting topic for many researchers.

DeLorean (DEscription LOgic REasoner with vAgueNess) [2] was the first reasoner that supported a fuzzy extension of the DL SROIQ. As far as we know, DeLorean is the only reasoner that supports fuzzy OWL 2. Based on Zadeh Semantics, it represents fuzzy operators and reduces the resulting fuzzy Z SROIQ knowledge base to a crisp one by creating new crisp concepts and roles representing α-cuts [20] of the original fuzzy ones. Other quite similar studies have proposed reasoners for expressive fuzzy DLs. For instance, FiRE implements a tableau algorithm for fuzzy SHIN restricted to Zadeh Semantics [25]. YADLR is a Prolog implementation based on linear programming [17]. It supports a fuzzy extension of ALCOQ under Łukasiewicz and Zadeh fuzzy logics and allows variables as degrees of truth. In order to benefit from the full expressivity of a less expressive language and thus guarantee reasoning efficiency, LiFR was proposed [31]. It is a lightweight fuzzy reasoner oriented to mobile devices, and the supported language is f-DLP. It allows fuzzy concept assertions and weighted concepts. FuzzyDL [3, 6] is an important fuzzy reasoner supporting fuzzy extensions of SHIF (D) under Zadeh, Łukasiewicz and classical semantics. It has been successfully used in some practical applications. Its interesting features are aggregation of fuzzy concepts, explicit fuzzy set membership functions and fuzzy modifiers.

Like all these cited works, we are interested in reasoning with imperfect knowledge using fuzzy logic. However, unlike them, we have focused on just one reasoning task, in order to propose a fuzzy ontology realization service that is as complete and efficient as possible. As far as we know, no other work exploits the fuzzy classification mechanism with fuzzy ontologies, especially with incomplete individuals. On the other hand, there have been some previous attempts to combine this reasoning mechanism with fuzzy logic in other research fields, such as pattern recognition and data mining.
3.2 Classification reasoning mechanism

fCQL (Fuzzy Classification Query Language) is a toolkit for classification, analysis and decision support applied in the marketing domain of a telecom company [19, 32]. Meier et al. claimed that 'Using linguistic terms and variables hides the complexity of the domain and permits a more intuitive and human-oriented querying process in different application domains' [19, pp. 586-587]. Therefore, they exploited the advantages of fuzzy logic to reduce the complexity of business data and extract valuable hidden information through fuzzy classification. fCQL allows formulating fuzzy queries which are then transformed into SQL statements. This approach benefits from fuzzy logic in classification and querying. However, its main disadvantage is that it is a data-oriented approach; thus, semantic retrieval of resources is not supported.

A closer approach to ours is [12], which defines a semi-automated musical genre classification mechanism using an ontological representation. Fuzzy classification was used to allow the classification of music resources into musical genres based on a score provided by the resource composer expressing its viewpoint. Indeed, in music classification, different users are not required to agree about the classification of a specific music resource into the same musical genre. In this approach, fuzzy classification is flexible regarding the different interpretations of music genres. However, the consideration of vagueness is quite limited because (i) music resources are represented by crisp ontologies, (ii) fuzzy logic is restricted to expressing users' viewpoints and (iii) the membership degree is not calculated based on membership functions but is instead given by the user. Finally, in this approach, (iv) knowledge imperfection was considered without reference to knowledge incompleteness, which is an important feature of fuzzy knowledge.

4 A fuzzy realization algorithm

Using an illustrative example, we will study and improve upon a fuzzy realization algorithm proposed in a previous work [9]. The following algorithm has been extended and improved in order to accelerate the classification process.

Table 1: Fuzzy realization algorithm

Algorithm 1. Fuzzy realization algorithm
Input:  H: fuzzy concepts hierarchy (fuzzy ontology)
        A: new individual
Output: evolved fuzzy ontology
1. Initialization ( );
2. C* := TOP (H);
3. While (not empty (C*)) do
4.   Matching (C*, A);
5.   Marks-Propagation (C*, label, degree);
6.   C* := Next-Concept (C*);
7. End while

The proposed algorithm allows the realization of fuzzy ontologies and results in evolved ones in which the new individual A is attached to its most specialized concepts. First, the user provides the necessary knowledge (line 1) to start the classification loop. This loop consists of exploring the hierarchy and matching the current concept C* with A (line 4), starting at the hierarchy root TOP (line 2). The Matching procedure verifies A's membership in C* and, if A belongs to C*, the concept will be marked with a label and a membership degree. To accelerate the classification, the Marks-Propagation procedure (line 5) propagates marks to different concepts related to C* based on some logical rules. The next concept to be matched with A is chosen by the Next-Concept function (line 6). If there are no more unmarked concepts, Next-Concept returns null, which terminates the classification.
Illustrative example. In the following sections, we will study the proposed algorithm on the following illustrative example; it is an excerpt from a simple fuzzy knowledge base about persons:

TBox
[Ax 1] Person ⊑ ⊤
[Ax 2] Male ⊑ ⊤
[Ax 3] Female ⊑ ⊤
[Ax 4] Male ≡ ¬Female
[Ax 5] Man ≡ Person ⊓ Male
[Ax 6] Woman ≡ Person ⊓ Female
[Ax 7] Young ≡ Person ⊓ ∃HasAge.YoungAge
[Ax 8] Adult ≡ Person ⊓ ∃HasAge.AdultAge
[Ax 9] Teacher ≡ Adult ⊓ ∃HasFunction.Teacher
[Ax 10] VeryYoung ≡ Person ⊓ ∃HasAge.very(YoungAge)
[FCP 1] YoungAge (x) = Left-shoulder (10, 30)
[FCP 2] AdultAge (x) = Trapezoidal (30, 35, 50, 60)
[FCP 3] Very (x) = x²

ABox
[FCA 1] ⟨Tom: Person = 1⟩
[FCA 2] ⟨Tom: Male = 1⟩
[FCA 3] ⟨Lina: Person = 1⟩
[FCA 4] ⟨Lina: Female = 1⟩

Person, Male and Female are defined as atomic concepts. Axioms [Ax 5] and [Ax 6] define crisp concepts, while [Ax 7]–[Ax 10] describe some fuzzy ones. [FCP 1] and [FCP 2] concern the fuzzy concrete predicates YoungAge and AdultAge; they indicate the degree to which a person is young or adult, respectively, using left-shoulder and trapezoidal membership functions. [FCP 3] defines the fuzzy modifier Very. Finally, the ABox contains some fuzzy concept assertions defining two individuals: Tom and Lina.

4.1 Initialization ( ) procedure

To start the classification loop, we need to collect some information about the new individual in the form of (attribute, value) pairs. The user must provide as much knowledge as possible so that the algorithm can classify the individual as precisely as possible in the hierarchy. If the user does not have enough information, the Initialization procedure accepts the value 'Unknown'. Consequently, the proposed algorithm can classify incomplete individuals.

Definition. Let A be an individual defined by its description in the form of a set of (attribute, value) pairs. If we are missing information about an attribute of A, then it is incomplete. Formally:

A is incomplete ⇔ A = {(Att1, Val1), …, (Attn, Valn)} and ∃i such that (Atti, Unknown) ∈ A.

Example 1. Consider our ABox, having the individual Tom with its description: Tom = {(Name, Tom), (Age, 33), (Size, 1.7), (Function, Unknown), …}. Since we are missing information about the attribute "Function", Tom is incomplete.

4.2 Matching (C*, A) procedure

Matching is the algorithm's key procedure. It has the role of checking an individual's membership in the current concept based on a membership function. The classical two-valued membership function has been successfully applied to complete and precise knowledge, for which we can exactly define the belonging classes. However, it seems inappropriate for managing fuzzy knowledge bases, in which we handle imprecise and incomplete knowledge. To cover this limitation, we have chosen a membership function with three values. The scope of this function is extended to accept the value possible if we do not have sufficient information for affirming or denying an instance's membership in a given class. This function can be described as follows, given that x is an instance and C is a class with the membership function C(x):
C(x) = sure, if x ∈ C; impossible, if x ∉ C; possible, otherwise.

Based on this function, the Matching procedure marks the current concept C* with a label indicating whether it is sure, possible or impossible for the new individual (see Figure 2). Since there is no full membership in fuzzy ontologies, C* will be marked with another mark expressing the degree of this membership. In sum, the Matching procedure generates the following output: ⟨C*, label, degree⟩, where label ∈ {S, P, I} and degree ∈ [0, 1] if the new individual belongs to C* (that is, label = sure), or null if there is no membership. For the rest of the paper, S, P and I will be used to represent, respectively, the marks Sure, Possible and Impossible:

- ⟨C*, S, d⟩ (A is C* with a truth-value of d): if A's value for each attribute satisfies the constraints of C*. This membership can be determined only if A is complete.
- ⟨C*, I, null⟩ (A is not C*): if A's value for at least one attribute does not satisfy the constraints in C*. In this case, we do not consider whether A is incomplete.
- ⟨C*, P, null⟩ (A may be C*): if A is incomplete and its values do not stand in contradiction with C*.

Figure 2: Fuzzy classification of an individual based on concept marking.

Table 2: Matching (C*, A) procedure.

Procedure 1. Matching (C*, A)
Input:  C*: current concept
        A = {(Att1, Val1), …, (Attn, Valn)}: new individual
Output: ⟨C*, label, degree⟩
Degree: real;
1  Begin
2    If (∃ Vali = "") then
3      Request the user;
4    End if
5    Degree := Get_degree (C*, A);
6    If (Degree > 0) then
7      If (∃ Vali = "Unknown") then
8        Mark (C*, P, null);
9      Else
10       Mark (C*, S, Degree);
11     End if
12   Else
13     Mark (C*, I, null);
14   End if
15 End.

If C* includes some attributes that are not defined in the description of A, then Matching asks the user for values for these attributes (lines 2–4). Using the function Get_degree, the membership of A in C* is computed (line 5). If there is no membership (Degree = 0), the matching stops and C* will be marked Impossible (lines 12–14). If all constraints are satisfied (that is, Degree > 0), two cases are considered:

- A is incomplete: the matching stops and C* will be marked Possible (lines 6–8).
- A is complete: C* will be marked Sure to some degree 'Degree' (lines 9–11).

In order to mark C*, the function Get_degree calculates A's degree of membership in C*. C* can be described by several logical expressions: concept conjunction, modified concept, explicit fuzzy concept, etc. Based on the description of C* and under Zadeh Semantics, the Get_degree function proceeds according to the following cases:

- Concept conjunction C* ≡ C1 ⊓ … ⊓ Cn: Degree (C*, A) = min (Degree (Ci, A)), i = 1..n.
- Concept disjunction C* ≡ C1 ⊔ … ⊔ Cn: Degree (C*, A) = max (Degree (Ci, A)), i = 1..n.
- Concept negation C* ≡ ¬C: Degree (C*, A) = 1 − Degree (C, A).
- Fuzzy modified concept C* ≡ m(C): Degree (C*, A) = fm (Degree (C, A)).
- Fuzzy weighted concept C* ≡ w (C): Degree (C*, A) = w · Degree (C, A).
- Explicit fuzzy concept C* ≡ ∃Attribute.Range, where Range is a fuzzy predicate: Degree (C*, A) = fRange (A.Attribute); e.g. ∃Age.YoungAge results in fYoungAge (A.Age).
- Limited existential quantification C* ≡ ∃R.C: the function returns the maximum degree of membership in the concept C over all individuals Ai related by the role R: Degree (C*, A) = max (Degree (C, Ai)).
- Value restriction C* ≡ ∀R.C: the function returns the minimum degree of membership in the concept C over all individuals Ai related by the role R: Degree (C*, A) = min (Degree (C, Ai)).
- Min cardinality C* ≡ ≥ n R.C: if |{Ai : Degree (C, Ai) > 0}| ≥ n then Degree (C*, A) = 1, else 0.
- Max cardinality C* ≡ ≤ n R.C: if |{Ai : Degree (C, Ai) > 0}| ≤ n then Degree (C*, A) = 1, else 0.
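To make the recursion over these cases concrete, the following Java sketch (a minimal illustration of our own, not the Get_degree code of Fuzzy Realizer) models a few of the constructors and recomputes the degree of Example 2 below, together with the modified-concept degree used later in Example 5.

// Minimal illustrative model of Get_degree under Zadeh semantics.
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

interface Concept { double degree(Map<String, Double> a); } // a: attribute -> value

public class GetDegree {
    static Concept and(Concept c1, Concept c2) {            // conjunction -> min
        return a -> Math.min(c1.degree(a), c2.degree(a));
    }
    static Concept or(Concept c1, Concept c2) {             // disjunction -> max
        return a -> Math.max(c1.degree(a), c2.degree(a));
    }
    static Concept not(Concept c) { return a -> 1.0 - c.degree(a); }  // negation
    static Concept modified(DoubleUnaryOperator fm, Concept c) {      // m(C) -> fm(d)
        return a -> fm.applyAsDouble(c.degree(a));
    }
    static Concept weighted(double w, Concept c) { return a -> w * c.degree(a); } // w(C)
    // Explicit fuzzy concept: degree is fRange applied to the attribute's value.
    static Concept explicit(String att, DoubleUnaryOperator fRange) {
        return a -> a.containsKey(att) ? fRange.applyAsDouble(a.get(att)) : 0.0;
    }

    public static void main(String[] args) {
        // Left-shoulder(10, 30): 1 below 10, 0 above 30, linear in between ([FCP 1]).
        DoubleUnaryOperator youngAge = x -> x <= 10 ? 1.0 : x >= 30 ? 0.0 : (30 - x) / 20.0;
        Concept person = a -> 1.0;                          // asserted with degree 1
        Concept young = and(person, explicit("Age", youngAge)); // [Ax 7]-style definition
        Concept veryYoung = modified(d -> d * d, young);        // very(x) = x^2, [FCP 3]
        Map<String, Double> lina = Map.of("Age", 12.0);
        System.out.println(young.degree(lina));      // ~0.9, cf. Example 2 below
        System.out.println(veryYoung.degree(lina));  // ~0.81, cf. Example 5
    }
}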
Example 2. Assume that the individual Lina in our earlier illustrative example is a 12-year-old girl. Person is already marked sure (⟨Person, S, 1⟩). Based on [Ax 7] and [FCP 1], Get_degree (Young, Lina) returns 0.9, and thus the fuzzy concept Young will be marked ⟨Young, S, 0.9⟩.

Example 3. Consider the individual Tom in Example 1. Based on its description, [Ax 8] and [FCP 2], Get_degree (Adult, Tom) = 0.6, and Matching will mark this concept as ⟨Adult, S, 0.6⟩. Moreover, based on [Ax 9], Teacher will be marked as ⟨Teacher, P, null⟩.

4.3 Marks-Propagation (C*, label, degree) procedure

In order to accelerate the classification process, Marks-Propagation minimizes the number of concepts to be verified by the Matching procedure. It is a recursive procedure that propagates marks to concepts related to C*, and to their related concepts. This procedure propagates marks based on the mark of C*, according to certain rules and under Zadeh Semantics. For instance, according to [R.1], all synonyms of C* will be marked sure to some degree d. Then, each of these synonyms D becomes the new input of Marks-Propagation, which starts to propagate marks to all concepts related to D, and so on, until there is no rule to be applied or no unmarked related concept to be marked.

[R.1] If ⟨C*, S, d⟩, then ∀D, C* ≡ D: ⟨D, S, d⟩.
[R.2] If ⟨C*, I, null⟩, then ∀D, C* ≡ D: ⟨D, I, null⟩.
[R.3] If ⟨C*, S, d⟩, then ∀D, C* ≡ ¬D: ⟨D, I, null⟩.
[R.4] If ⟨C*, I, null⟩, then ∀D, C* ≡ ¬D: ⟨D, S, d⟩ / d = Get_degree (D, A). In this case, we can confirm the membership of A in D. However, the degree of this membership must be computed by Get_degree (D, A).
[R.5] If ⟨C*, S, d⟩, then ∀D, C* ⊑ D: ⟨D, S, ≥ d⟩.
[R.6] If ⟨C*, I, null⟩, then ∀D, D ⊑ C*: ⟨D, I, null⟩.
[R.7] If ⟨C*, P, null⟩, then ∀D, D ⊑ C*: ⟨D, label, null⟩ / label ∉ {S}. This rule can be used to check some consistency problems. Indeed, if C* is possible for A, then A is incomplete for C* and for all of its more specific concepts.
[R.8] If ⟨C*, S, d⟩, then ∀D, D ≡ m(C*): ⟨D, S, fm (d)⟩.
[R.9] If ⟨C*, S, d⟩, then ∀D, D ≡ w(C*): ⟨D, S, w·d⟩.

Supposition 1. D is defined by a concept conjunction including C*, as D ≡ C* ⊓ C1 ⊓ … ⊓ Cn.

[R.10] If ⟨C*, I, null⟩, then ⟨D, I, null⟩.
[R.11] If ⟨Ci, S, di⟩ for i = 1..n and ⟨C*, S, d⟩, then ⟨D, S, deg⟩ / deg = min (d, di), i = 1..n.
[R.12] If ⟨D, I, null⟩, ⟨C*, S, d⟩ and ∃j ∈ {1..n} with ⟨Cj, "", ""⟩ (which means that Cj is unmarked), while ∀i ∈ {1..n}, i ≠ j: ⟨Ci, S, di⟩, then ⟨Cj, I, null⟩.

Supposition 2. D is defined as a concept disjunction including C*, as D ≡ C* ⊔ C1 ⊔ … ⊔ Cn.

[R.13] If ⟨C*, S, d⟩, then ⟨D, S, deg⟩ / deg = max (d, di), i = 1..n.
[R.14] If ⟨Ci, I, null⟩ for i = 1..n and ⟨C*, I, null⟩, then ⟨D, I, null⟩.
[R.15] If ⟨D, S, d⟩, ⟨C*, I, null⟩ and ∃j ∈ {1..n} with ⟨Cj, "", ""⟩, while ∀i ∈ {1..n}, i ≠ j: ⟨Ci, I, null⟩, then ⟨Cj, S, d⟩.
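As a compact illustration, the following Java sketch (our own encoding, not the prototype's Marks-Propagation code) applies three of these rules; the resulting marks agree with Examples 4 and 5 below.

import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

public class MarksPropagationSketch {
    enum Label { S, P, I }
    record Mark(Label label, Double degree) { }       // degree is null for P and I
    static final Map<String, Mark> marks = new HashMap<>();

    // [R.3]: the negation of a sure concept is marked impossible.
    static void negationOfSure(String d) { marks.put(d, new Mark(Label.I, null)); }
    // [R.8]: D defined as m(C*) with <C*, S, d> is marked <D, S, fm(d)>.
    static void modifiedOf(String d, double deg, DoubleUnaryOperator fm) {
        marks.put(d, new Mark(Label.S, fm.applyAsDouble(deg)));
    }
    // [R.11]: a conjunction whose conjuncts are all sure is sure to degree min(di).
    static void conjunctionOf(String d, double... degs) {
        double min = 1.0;
        for (double x : degs) min = Math.min(min, x);
        marks.put(d, new Mark(Label.S, min));
    }

    public static void main(String[] args) {
        conjunctionOf("Man", 1.0, 1.0);           // [Ax 5], [FCA 1], [FCA 2]: <Man, S, 1>
        negationOfSure("Female");                 // [Ax 4]: <Female, I, null>
        modifiedOf("VeryYoung", 0.9, x -> x * x); // [Ax 10]: <VeryYoung, S, 0.81>
        System.out.println(marks);
    }
}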
Example 4. Recall the individual Tom from our illustrative example. Since Person and Male are already marked Sure ([FCA 1] and [FCA 2]), Marks-Propagation can propagate the mark Sure to the concept Man, based on [Ax 5] and applying [R.11]: ⟨Man, S, 1⟩. It can also propagate the mark Impossible to Female, based on [Ax 4] and applying [R.3]. Moreover, Woman will be impossible for Tom, based on [Ax 6] and applying [R.10].

Example 5. During the classification of Lina in Example 2, we have generated the result ⟨Young, S, 0.9⟩. Thus, based on [Ax 10] and using [R.8], Marks-Propagation can mark the modified concept VeryYoung as ⟨VeryYoung, S, 0.81⟩. Consider the same concepts and the individual Tom. If Matching (Young, Tom) results in ⟨Young, I, null⟩, then the same mark will be propagated to VeryYoung as ⟨VeryYoung, I, null⟩.

4.4 Next-Concept (C*) function

The aim of this function is to select a new unmarked concept to be the next current concept, by traversing the hierarchy of fuzzy concepts. We use breadth-first search, one of the important graph traversal techniques, to explore the hierarchy graph. Using this technique, Next-Concept selects the next unmarked neighbouring concept of C*. After testing all the unmarked neighbours, the function moves to the next level of the hierarchy and goes from left to right to select a new target concept. If there are no more unmarked concepts, Next-Concept returns null.

In our work, we were inspired by the multi-viewpoints classification algorithm proposed in [18], in which classification was used in an object-oriented multi-viewpoints representation system named TROPES. This algorithm provides multi-viewpoints instance classification, in which an instance can be classified in one or more viewpoints. This work was extended to consider individual reclassification in multi-viewpoints ontologies [11]. These multi-viewpoints classification algorithms [11, 18] are both based on the hypothesis of the exclusiveness of sister classes, which assumes that classes at the same hierarchy level (called sister classes) represent mutually exclusive sets. Therefore, an individual which belongs to a class cannot belong to any of its sister classes. Unlike the cited works, our algorithm is not based on this hypothesis. Indeed, in our fuzzy ontology conceptualization, fuzzy concepts are modelled as fuzzy subsets [8]. The strength of fuzzy logic in knowledge representation lies in the intersections between fuzzy subsets, as an element can belong to several fuzzy subsets with different membership degrees. Consequently, the main advantage of fuzzy classification compared to classical classification is that an element is not limited to a single class but can be assigned to several sister classes, which better reflects reality.

Example 6. Consider the two fuzzy concepts Child and Teenager, defined by their trapezoidal membership functions (see Figure 3). For the little girl Lina of Example 2, we can calculate these memberships: Child (Lina) = 0.66 and Teenager (Lina) = 0.33. These results indicate that Lina is considered a Child but also a Teenager, with different membership degrees.

Figure 3: Assignment of an individual to different fuzzy concepts.
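Since the exact trapezoid parameters of Child and Teenager are given only graphically in Figure 3, the following Java sketch uses hypothetical parameters (our assumption, chosen so that the reported degrees of Example 6 are approximately reproduced, up to rounding) to recompute the two memberships.

// Trapezoidal(a, b, c, d): 0 outside [a, d], 1 on [b, c], linear on the edges.
public class SisterClasses {
    static double trapezoidal(double x, double a, double b, double c, double d) {
        if (x <= a || x >= d) return 0.0;
        if (x < b)  return (x - a) / (b - a);  // rising edge
        if (x <= c) return 1.0;                // plateau
        return (d - x) / (d - c);              // falling edge
    }

    public static void main(String[] args) {
        double age = 12.0; // Lina, Example 2
        // Hypothetical parameters consistent with Child(Lina) ~ 0.66, Teenager(Lina) ~ 0.33.
        double child    = trapezoidal(age, 0, 0, 10, 16);
        double teenager = trapezoidal(age, 10, 16, 17, 19);
        System.out.printf("Child(Lina) = %.2f, Teenager(Lina) = %.2f%n", child, teenager);
        // Lina belongs to both sister classes, with different membership degrees.
    }
}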
5 Individual relocation: an extension of the fuzzy realization approach

The proposed algorithm provides a complete and efficient realization service for fuzzy ontologies. Indeed, it can efficiently classify individuals, even incomplete ones, into their appropriate belonging concepts with their membership degrees. With this, we can ensure an evolutionary aspect of fuzzy ontologies by realizing them with new individuals. After their classification, individuals may evolve and update their knowledge. Indeed, a person changes age, address or profession. Therefore, a relocation process is necessary to evolve fuzzy ontologies. To this end, our proposed algorithm is extensible. In fact, an extension of the fuzzy realization algorithm may consider another aspect of fuzzy ontology evolution, in which already classified, but updated, individuals can be relocated. This process allows an individual to migrate from its current belonging concepts to new ones that satisfy its updated description [10]. Changes of an individual description may be the result of an:

- Enrichment of an incomplete individual, by replacing its unknown value by a concrete one,
- Modification of a concrete value by a new one, or
- Impoverishment, i.e. removal of a concrete value and its replacement by an unknown one.

In the first two cases, we have to handle new data. This data can satisfy the fuzzy ontology constraints and thus result in a consistent fuzzy ontology. It can also be in contradiction with some constraints and thus generate an inconsistency:

Fuzzy ontology in a consistent state. In this case, the individual's belonging concepts must keep their marks as ⟨Ci*, S, di⟩. However, the individual's new description may allow it to migrate to concepts that are more specific. Thus, for this first case, a simple realization process is invoked to descend the evolved individual in the hierarchy, starting at its belonging concepts.

Fuzzy ontology in an inconsistent state. To deal with this inconsistency, a fuzzy relocation process is invoked to migrate the updated individual to its new belonging concepts. To this end, the individual is raised up in the fuzzy hierarchy by following the path of its sure super-concepts until the first super-concept for which the new data satisfies its constraints. It should be noted that all super-concepts along the individual's path (except the last one) must change their marks from ⟨Ci*, S, di⟩ to ⟨Ci*, I, null⟩. To complete the individual relocation, the updated individual must descend in the fuzzy hierarchy until it reaches its new belonging concepts. Indeed, the individual's updated description can satisfy other concepts that are more specific than the first consistent super-concept. Thus, starting at this concept, the fuzzy realization process is invoked.

The individual's knowledge can also evolve to an unknown value. This impoverishment will not affect the ontology consistency. However, the evolved individual becomes incomplete, since the concrete value of the updated attribute has been replaced with an unknown one. To handle this change, the fuzzy ontology must evolve and the updated individual must be raised up, by following the path of its super-concepts until the first sure super-concept in which there is no specification for the impoverished attribute. All these super-concepts (except the last one) must change their marks from ⟨Ci*, S, di⟩ to ⟨Ci*, P, null⟩. Unlike the enrichment/modification cases, in the case of an individual impoverishment, once the ascent to the first sure super-concept is done, no further descent is possible. Indeed, there is no concrete new data to be matched with more specific concepts.
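The ascent step of this relocation process can be pictured with the following Java sketch (entirely our own, hypothetical encoding of the mark-flipping described above, using the Young concept of the illustrative example; it is not the code of the proposed extension).

// Illustrative sketch of the relocation ascent after an inconsistent update.
import java.util.Map;
import java.util.function.Predicate;

public class RelocationSketch {
    static class Node {
        final String name;
        final Node parent;
        final Predicate<Map<String, Double>> constraints; // does A satisfy this concept?
        String mark = "S";                                // assume A was sure here
        Node(String name, Node parent, Predicate<Map<String, Double>> c) {
            this.name = name; this.parent = parent; this.constraints = c;
        }
    }

    // Ascend from the individual's most specific sure concept; every concept whose
    // constraints the updated description violates changes its mark from S to I.
    static Node ascend(Node current, Map<String, Double> updated) {
        while (current != null && !current.constraints.test(updated)) {
            current.mark = "I";       // <Ci*, S, di> becomes <Ci*, I, null>
            current = current.parent;
        }
        return current; // first consistent super-concept; realization restarts here
    }

    public static void main(String[] args) {
        Node person = new Node("Person", null, a -> true);
        Node young  = new Node("Young", person, a -> a.get("Age") < 30);
        Map<String, Double> tom = Map.of("Age", 33.0);  // Tom's updated age
        Node restart = ascend(young, tom);
        System.out.println("Restart realization at: " + restart.name); // Person
        System.out.println("Young mark: " + young.mark);               // I
    }
}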
6 Validation of the proposed algorithm

In order to validate our ideas, we have implemented the proposed fuzzy realization algorithm (as part of a masters' project [15]) as Fuzzy Realizer. It is a Java prototype implementation that supports a fuzzy extension of the well-known DL Z SHOIN (D). Fuzzy Realizer has a graphical interface for displaying the fuzzy ontology in the form of a coloured directed acyclic graph (DAG), in order to improve the presentation of the results and thereby facilitate the decision-making process (see Figure 4).

Fuzzy Realizer has a modular architecture and is divided into three modules: the Parser, Visualization and Classification modules. The Parser translates the fuzzy ontology into an internal format, so that any fuzzy ontology encoded in any language (OWL, Fuzzy OWL, OWL 2, ...) can be used. The Visualization module displays the loaded ontology hierarchy in the form of a DAG. Finally, Classification, the proposed system's key module, calculates the new individual's membership in the different (fuzzy) concepts. Once the individual is attached to its belonging concepts, the Visualization module displays the concepts' marks on the created DAG. Each mark is represented by a colour, which produces a coloured DAG. The colours red, orange and green are used to represent, respectively, impossible, possible and sure concepts.

Figure 4: Fuzzy Realizer interface.

In order to further facilitate the decision-making process, membership degrees are represented by numerical values on the coloured graph nodes and also by gradations of the colour green, ranging from light green, which represents a low membership, to dark green, which represents full membership (see Figure 5).

In order to evaluate the proposed system's performance, we carried out a range of experiments with different fuzzy ontologies, beginning with a simple Medical Checkup Fuzzy Ontology (MCFO) and then using more highly expressive and voluminous fuzzy ontologies: Fuzzy Wine (http://users.abo.fi/rowikstr/FuzzyWineOntology/FuzzyWineOntology.owl), Matchmaking (http://www.umbertostraccia.it/cs/software/FuzzyOWL/ontologies/matchmaking.owl) and Multi-criteria decision making (http://www.umbertostraccia.it/cs/software/FuzzyOWL/ontologies/multiCriteria.owl). We present the results for two of these ontologies in the following subsections. We also compared our Fuzzy Realizer with the well-known fuzzy reasoner FuzzyDL [6, 9]; this was done by replacing the Classification module with the fuzzy reasoner FuzzyDL.

6.1 Medical Checkup Fuzzy Ontology (MCFO)

Uncertainty is the central critical fact about reasoning in the e-health domain. Usually, doctors cannot give exact diagnoses and laboratories cannot report exact analysis results. Despite this uncertainty, doctors have to make decisions. In order to implement a decision-making process using medical check-up fuzzy knowledge, we developed MCFO using the fuzzy DL SIQ (D) and Fuzzy OWL. Then, using Fuzzy Realizer, we realized it with new individuals. Table 3 represents the description of the new (incomplete) individual Tim.

Table 3: Description of the new individual Tim.

Attribute                  Value
Body Temperature           37.45°
Blood Sugar                1.0 g/l
Body Mass Index            26.0 kg/m²
Heart Pulse                Unknown
Respiratory Rate           Unknown
Diastolic Blood Pressure   70.0 mmHg
Systolic Blood Pressure    100.0 mmHg
Calcium Level              2.3 mmol/l

Although it is an incomplete individual, Fuzzy Realizer was able to classify it as low as possible in the hierarchy (see Figure 5) by providing the set of its sure (to some degree) and possible fuzzy concepts. This classification cannot be done using FuzzyDL, since it does not offer a service for classifying incomplete individuals.
Figure 5: Zoom of the classification of the incomplete individual Tim.

6.2 Fuzzy Wine ontology

The fuzzy extension of the well-known and highly expressive Wine ontology, supporting the DL SHOIN (D), is the most voluminous open-source fuzzy ontology. Thus, we used this ontology in order to test our proposed system's performance. Despite its large size, we have been able to realize it with new individuals using our prototype. Figure 6 shows the classification of the new individual ChateauDeMeursauCru2007, described in Table 4, which is considered to be a HighUWSWine to degree 0.1 and a fully (degree = 1) HighPriceWine and TableWine.

Figure 6: Realizing Fuzzy Wine.

Table 4: Description of the new individual ChateauDeMeursauCru2007.

Attribute   Value
Price       38.6
PH          3.42
Acidity     5.8
Sugar       1.7
UWSScore    89.0
Flavor      ModerateWineFlavor
Maker       ChateauDeMeursaultWinery

6.3 Discussion

Although Fuzzy Realizer is a simple prototype providing a realization service for fuzzy ontologies, several series of tests show that it offers an efficient realization service, since it results in correct classifications (all results were verified by domain experts). Moreover, it is capable of realizing any fuzzy ontology without any constraint on the imperfection of the represented knowledge. It is also able to realize highly expressive fuzzy ontologies, even with incomplete individuals. Indeed, it was used to realize the most voluminous open-source fuzzy ontology (Fuzzy Wine).

Figure 7: Response time of Fuzzy Realizer modules.

More importantly, its response time is within the limits of acceptability compared to the well-known fuzzy reasoner FuzzyDL, as shown in Figure 7. All of these characteristics allow the proposed prototype to be tested in a real application and to handle real-world knowledge. In sum, despite its simplicity, Fuzzy Realizer can be considered an optimal solution for realizing fuzzy ontologies. In contrast, FuzzyDL is one of the most expressive and important fuzzy reasoners. However, its long runtime compared with Fuzzy Realizer and its inability to classify individuals in cases where we may lack information, which is a quite common problem, are weaknesses which cannot be ignored.

7 Conclusion

In this paper, we have proposed a fuzzy-based approach for reasoning with imperfect ontological knowledge. As a reasoning mechanism, we have integrated fuzzy logic with the most powerful human reasoning activity, known as classification. Using fuzzy classification, we have proposed Fuzzy Realizer, a Java prototype for classifying new individuals into fuzzy ontologies. It allows (i) fuzzy concrete domains, (ii) modified concepts and (iii) weighted concepts. We have focused on just one reasoning task in order to address an aspect of fuzzy ontology evolution, namely the realization issue. The proposed prototype can realize fuzzy ontologies even with incomplete individuals. In addition, it offers a more human-oriented classification by assigning an individual to several fuzzy sister classes, which hides the sharp boundaries between them.

As future work, we would like to extend the proposed prototype so that it will not be limited to Zadeh semantics, but will be more flexible by supporting more fuzzy logics, for instance Łukasiewicz, Gödel or Product logics. We also intend to implement the relocation process extension so that we can test and evaluate the proposed idea.
Finally, in order to improve Fuzzy Realizer's performance, we would like to minimize the use of the mark 'possible'. To that end, we intend to propose a new conceptualization of concepts by dividing each concept's set of attributes into two groups: key attributes and auxiliary ones. During the classification of an incomplete individual, if the 'unknown' attribute is an auxiliary one, then the current concept can be marked 'sure'. For example, if Tom has obtained a medical diploma, then even though we may lack information about his age, his address or even his last name, we can be sure that he is a doctor. Therefore, in order to mark the concept Doctor as 'sure', it is not necessary to have known values for all attributes.

Acknowledgements

We would like to thank the anonymous referees for their valuable comments on an earlier version of this paper.

References

[1] Alexopoulos, P., Wallace, M., Kafentzis, K. & Askounis, D. (2012) 'IKARUS-Onto: a methodology to develop fuzzy ontologies from crisp ones', Knowledge and Information Systems, Vol. 32 No. 3, pp. 667-695.

[2] Bobillo, F., Delgado, M. & Gómez-Romero, J. (2013) 'Reasoning in Fuzzy OWL 2 with DeLorean'. In Bobillo, F. et al. (Eds.), Uncertainty Reasoning for the Semantic Web II, Vol. 7123 of Lecture Notes in Computer Science, Springer-Verlag, pp. 119-138.

[3] Bobillo, F. & Straccia, U. (2008) 'fuzzyDL: an expressive fuzzy description logic reasoner'. In Proceedings of the 17th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), IEEE Computer Society Press, pp. 923-930.

[4] Bobillo, F. & Straccia, U. (2011) 'Fuzzy ontology representation using OWL 2', International Journal of Approximate Reasoning, Vol. 52 No. 7, pp. 1073-1094.

[5] Bobillo, F. & Straccia, U. (2013) 'General concept inclusion absorptions for fuzzy description logics: A first step'. In Description Logics, pp. 513-525.

[6] Bobillo, F. & Straccia, U. (2016) 'The fuzzy ontology reasoner fuzzyDL', Knowledge-Based Systems, Vol. 95, pp. 12-34.

[7] Calegari, S. & Ciucci, D. (2007) 'Fuzzy ontology, fuzzy description logics and fuzzy-OWL'. In Proceedings of the 7th International Workshop on Fuzzy Logic and Applications (WILF 2007), Vol. 4578 of Lecture Notes in Computer Science, Springer-Verlag, pp. 118-126.

[8] Djellal, A. & Boufaida, Z. (2012) 'Conceptualisation d'une Ontologie Floue'. In Proceedings of 9ème Colloque sur l'Optimisation et les Systèmes d'Information, Tlemcen, Algeria, pp. 62-73.

[9] Djellal, A. & Boufaida, Z. (2014) 'Fuzzy ontology evolution: classification of a new individual', Journal of Emerging Technologies in Web Intelligence, Vol. 6 No. 1, pp. 9-14.

[10] Djellal, A. & Boufaida, Z. (2016) 'Individual relocation: a fuzzy classification based approach'. In Model and Data Engineering: 6th International Conference, MEDI 2016, Almería, Spain, September 21-23, 2016, Proceedings, Vol. 9893, Springer, p. 209.

[11] Djezzar, M., Hemam, M. & Boufaida, Z. (2012) 'Ontological re-classification of individuals: a multi-viewpoints approach'. In Proceedings of the 2nd International Conference on Model and Data Engineering, LNCS 7602, Springer, pp. 91-102, Poitiers, France.

[12] Ferrara, A., Ludovico, L.A., Montanelli, S., Castano, S. & Haus, G. (2006) 'A semantic web ontology for context-based classification and retrieval of music resources', ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol. 2 No. 3, pp. 177-198.

[13] Gao, M. & Liu, C.
(2005) 'Extending OWL by fuzzy description logic'. In Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 05), IEEE Computer Society, pp. 562-567.

[14] Ghorbel, H., Bahri, A. & Bouaziz, R. (2009) 'Fuzzy Protégé for fuzzy ontology models'. In Proceedings of the 11th International Protégé Conference (IPC'2009), Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands.

[15] Hecham, A. & Iaiche, I. E. (2015) 'Système de classification et de visualisation d'instances dans une ontologie floue'. Masters thesis, Constantine 2 - Abdelhamid Mehri University, Constantine, Algeria.

[16] Kaufmann, M. & Meier, A. (2009) 'An inductive fuzzy classification approach applied to individual marketing'. In Fuzzy Information Processing Society, NAFIPS 2009, Annual Meeting of the North American, IEEE, pp. 1-6.

[17] Konstantopoulos, S. & Charalambidis, A. (2010) 'Formulating description logic learning as an inductive logic programming task'. In Proceedings of the 19th IEEE International Conference on Fuzzy Systems, IEEE Press.

[18] Mariño, O. (1993) 'Raisonnement classificatoire dans une représentation à objets multi-points de vue'. PhD thesis, Joseph-Fourier-Grenoble I University, France.

[19] Meier, A., Schindler, G. & Werro, N. (2008) 'Fuzzy classification on relational databases'. In J. Galindo (Ed.), Handbook of Research on Fuzzy Information Processing in Databases, Vol. 2, Idea Group Publishing, Hershey, PA, pp. 586-614.

[20] Palash, D., Hrishikesh, B. & Tazid, A. (2011) 'Fuzzy arithmetic with and without using α-cut method: a comparative study', International Journal of Latest Trends in Computing, Vol. 2 No. 1, pp. 99-107.

[21] Pérez, I. J., Wikström, R., Mezei, J., Carlsson, C. & Herrera-Viedma, E. (2013) 'A new consensus model for group decision making using fuzzy ontology', Soft Computing, Vol. 17 No. 9, pp. 1617-1627.

[22] Rodríguez, N. D., Cuéllar, M. P., Lilius, J. & Calvo-Flores, M. D. (2014) 'A fuzzy ontology for semantic modelling and recognition of human behaviour', Knowledge-Based Systems, Vol. 66, pp. 46-60.

[23] Scharrenbach, T. & Bernstein, A. (2009) 'On the evolution of ontologies using probabilistic description logics'. In Proceedings of the First ESWC Workshop on Inductive Reasoning and Machine Learning on the Semantic Web.

[24] Simou, N., Mailis, T.P., Stoilos, G. & Stamou, G.B. (2010) 'Optimization techniques for fuzzy description logics'. In Description Logics.

[25] Stoilos, G., Simou, N., Stamou, G. & Kollias, S. (2006) 'Uncertainty and the semantic web: intelligent systems', IEEE Intelligent Systems, Vol. 21 No. 5, pp. 84-87.

[26] Stoilos, G., Stamou, G. & Pan, J.Z. (2010) 'Fuzzy extensions of OWL: logical properties and reduction to fuzzy description logics', International Journal of Approximate Reasoning, Vol. 51 No. 6, pp. 656-679.

[27] Straccia, U. (2001) 'Reasoning within fuzzy description logics', Journal of Artificial Intelligence Research (JAIR), Vol. 14, pp. 137-166.

[28] Straccia, U. (2012) 'Description logics with fuzzy concrete domains'. arXiv preprint arXiv:1207.1410.

[29] Straccia, U. (2013) Foundations of Fuzzy Logic and Semantic Web Languages. CRC Press.

[30] Straccia, U. (2015) 'All about fuzzy description logics and applications'. In Reasoning Web. Web Logic Rules, Springer International Publishing, pp. 1-31.

[31] Tsatsou, D., Dasiopoulou, S., Kompatsiaris, I. & Mezaris, V. (2014) 'LiFR: A lightweight fuzzy DL reasoner'. In The Semantic Web: ESWC 2014 Satellite Events, Springer International Publishing, pp. 263-267.
SK-languages as a Powerful and Flexible Semantic Formalism for the Systems of Cross-Lingual Intelligent Information Access

Vladimir A. Fomichov
School of Business Informatics, Faculty of Business and Management
National Research University Higher School of Economics (HSE), Kirpichnaya str. 33, 105187 Moscow, Russia
E-mail: vfomichov@hse.ru, http://www.hse.ru/eng/org/persons/67739

Keywords: semantic parsing, theory of K-representations, abstract meaning representation, formal representation of semantic content

Received: January 26, 2017

The first starting point of this paper is the broadly accepted idea of employing an artificial semantic language-intermediary as a promising methodology for the realization of automatic cross-lingual intelligent information access to natural language (NL) texts on the Web. The second one is the emergence in computational semantics during 2013-2016 of great interest in the semantic formalism (more exactly, notation) called Abstract Meaning Representation (AMR). This formalism was introduced in 2013 in an ACL publication by a group of ten researchers from the UK and the USA. This paper shows that much broader prospects for creating semantic languages-intermediaries in comparison with AMR are opened by the theory of K-representations (TKR), developed by V. A. Fomichov. The basic mathematical model of TKR describes the regularities of NL structured meanings; its mathematical essence is that it introduces a system of ten partial operations on conceptual structures. The initial version of this model was published in 1996 in Informatica (Slovenia). The second version of the model (stated in a monograph released by Springer in 2010) defines a class of formal languages called SK-languages (standard knowledge languages). It is demonstrated that SK-languages allow us to simulate all expressive mechanisms of AMR. The advantages in comparison with AMR are, in particular, the possibilities to construct semantic representations of compound infinitive constructions (expressing goals, commitments, etc.), of compound descriptions of notions and sets, and of complex discourses and knowledge pieces.

Povzetek: Opisani so SK-jeziki za fleksibilno med-jezikovno dostopanje.

1 Introduction

During the last decade, one has been able to observe a quickly growing interest in the design of computer intelligent agents fulfilling cross-lingual information retrieval (CLIR) on the Web. It is a consequence of the emergence of a huge, permanently increasing number of Web sources in languages other than English. In September 2012, a seminar on the Multilingual Semantic Web (MSW) was organized at Dagstuhl Castle in Germany. The proceedings of this seminar contain the following data [5]: in the year 2010, the number of non-English-speaking Internet users was almost three times as high as the number of English-speaking users (1430 million vs. 536 million users). That is why the problem of developing an MSW is very topical [24-26, 35, 49, 56].
It is broadly accepted that a promising approach to the realization of CLIR on the Web is to employ a special semantic language-intermediary (SLI) in order to represent in the same format both the semantic content of a user query and the semantic content of the analysed fragment of a text in natural language (NL) [4, 7, 13-20, 24-26, 30, 32, 46, 49, 51, 52, 56]. The problem of creating a broadly applicable and flexible SLI goes far beyond the scope of CLIR. During the last decade, the semantic parsing branch of computational linguistics has been considerably strengthened and expanded [36]. The main objective of this branch is to develop and implement algorithms extracting meanings from NL-texts and forwarding them to the pragmatic subsystems of applied intelligent systems. The real resurrection of the semantic parsing branch (after two decades when statistics-oriented approaches to NL processing dominated) has been caused, first of all, by the rapid progress in designing autonomous intelligent agents (robots) and various mobile devices (cell phones, tablets, etc.) [36, 44, 45]. Another reason is the problem of understanding Web sources in many natural languages on requests of the end users or of computer intelligent agents. The use of an SLI is also reasonable in full-text question-answering systems and in NL interfaces (in particular, to robots and mobile devices), even in the case of texts in one language. There is one more circumstance showing the high topicality of developing broadly applicable and flexible SLIs. During the last decade, several IT companies have emerged in different countries whose principal objective is to combine the informational technologies of the Semantic Web and NL processing. In particular, these are Ontos GmbH in Switzerland [40, 53] and Cambridge Semantics Inc., The Smart Data Company, in Boston, MA, USA [6]. During the last decade, many scholars have seen a reasonable way of creating the preconditions for understanding NL-texts by computer systems in developing special linguistic databases containing sentences associated with manually constructed semantic representations (SRs); in other terms, with semantic annotations. Since the year 2013, numerous papers have been published on employing the notation called Abstract Meaning Representation (AMR) for constructing semantic annotations of NL sentences, in particular, of sentences in English, Czech, and Chinese [1, 2, 35, 43, 47, 54, 55]. The aim of this paper is to attract the attention of researchers in computational semantics to the fact that there is a formal theory opening much broader prospects for building SRs of NL sentences and discourses in comparison with AMR. It is the theory of K-representations (knowledge representations) - an original theory of designing semantic parsers of NL-texts with the broad use of formal means for representing the input, intermediary, and output data of the algorithms. Besides, it enriches the logical-informational foundations of the MSW, multi-agent systems, e-commerce, knowledge representation in advanced ontologies, and knowledge representation in multimedia databases. The monographs [21, 25] state two versions of the theory of K-representations (TKR). It is an expansion of the theory of K-calculuses and K-languages (the KCL-theory). The basic ideas and results of TKR are set forth in numerous publications both in Russian and English, in particular, in [12-30].
TKR is the kernel of the Integral Formal Semantics of NL; its basic principles and composition are stated in [16] and in Chapter 2 of [25]. The structure of this paper is as follows. Section 2 analyses related approaches; the main attention is paid to Semantic Role Labeling, Frame-Semantic Parsing, and Abstract Meaning Representation. Section 3 contains a task statement. Section 4 shortly describes the expressive mechanisms of SK-languages, introduced by TKR. Section 5 sets forth the principal distinctive features of the algorithms of semantic parsing proposed by TKR. Section 6 shortly indicates the computer applications of TKR. Section 7 outlines the prospects of using SK-languages in the development of an MSW. Section 8 concludes the paper.

2 Related approaches

2.1 Semantic role labeling branch of computational semantics

The goal of extracting meaning from NL-texts (and constructing its complete or partial representation) emerged in many application domains in the early 2000s and initiated a number of research projects throughout the world. The main stream in this field includes, in particular, the interrelated branches called Semantic Role Labeling (SRL) and Frame-Semantic Parsing (FSP). The principal task considered in SRL is to find semantic relations (called semantic roles) between the verbal forms (and some other predicate words) and the dependent word groups. For instance, it is possible to find the semantic roles Agent, Phenomenon, and Time in the sentence "The Russian Nobel laureate Ivan Pavlov discovered conditioned reflexes in the beginning of the XXth century". The aim of SRL is, firstly, to find the realized semantic roles and, secondly, to construct a formal expression called a semantic representation in order to process it in the context of a discussed situation and an ontology. The fundamental problem of SRL is that in the early 2000s one felt the lack of formal means for reflecting the semantic structure of arbitrary sentences. Example. Let S1 = "Yesterday Robert heard that the firm "Rainbow" would move to Manchester", S2 = "Robert decided to leave the firm "Rainbow"". Regretfully, as recently as five years ago the field of SRL did not possess effective formal means for building SRs of sentences with complex direct or indirect speech, with infinitive constructions, and with modalities. In particular, this applies to the sentences S1 and S2. A significant twofold event in the development of this branch was the publication of the pioneering work [31] on a computer program for statistical SRL and the creation of the PropBank annotations repository [33]. These two publications became the starting point for designing a number of applied computer systems aimed at finding predicate-argument structures reflecting the semantics of sentences and short discourses. The PropBank annotations consist of phrase-structure syntax trees from the Wall Street Journal section of the Penn Treebank [38] complemented by predicate-argument structures for the verbs. PropBank uses the core roles ARG0 through ARG5, and these roles have different interpretations for different predicates. There are many studies aimed at SRL and using PropBank conventions [3, 39, 42]. The problem with using such predicate-argument structures is that the roles ARG2 - ARG5 serve many different purposes for different verbs [58]. A way out is provided by the branch of NL processing (NLP) called Frame-Semantic Parsing, closely connected with the branch SRL [9].
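To make the PropBank conventions just described concrete, a predicate-argument structure can be sketched as a small data structure. The sketch below is in Python (the language named in Section 4.2 as the implementation language of SemSyn); the frameset id "discover.01" and the concrete role fills are illustrative assumptions, not material taken from PropBank or from the cited systems.

# A minimal sketch of a PropBank-style predicate-argument structure.
# The role inventory (ARG0..ARG5 plus ARGM-* modifiers) follows
# PropBank conventions; the frameset id and spans are illustrative.
from dataclasses import dataclass, field

@dataclass
class PredicateArgumentStructure:
    predicate: str                             # target verb lemma
    roleset: str                               # frameset id, e.g. "discover.01"
    args: dict = field(default_factory=dict)   # role label -> text span

pas = PredicateArgumentStructure(
    predicate="discover",
    roleset="discover.01",                     # hypothetical frameset id
    args={
        "ARG0": "The Russian Nobel laureate Ivan Pavlov",    # Agent
        "ARG1": "conditioned reflexes",                      # Phenomenon
        "ARGM-TMP": "in the beginning of the XXth century",  # Time
    },
)

for role, span in pas.args.items():
    print(f"{role:9s} -> {span}")

Note how the structure itself says nothing about what ARG0 or ARG1 mean for this particular verb; exactly this verb-specific reinterpretation of the core roles is the difficulty discussed above.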
The basis of this branch is the linguistic resource FrameNet [10]; it stores significant information about the lexical semantics and predicate-argument semantics of sentences in English. The FrameNet lexicon contains semantic frames, each of which includes a list of lexical units - associated words and word combinations that are able to evoke the considered semantic frame in an NL expression. Besides, each semantic frame from FrameNet indicates several roles corresponding to the facets of the scenario represented by the frame. One says that targets are the predicates (verbs, etc.) evoking frames, and an argument is a word or a phrase filling a role. For example, the frame JUDGMENT from the FrameNet database contains the hand-annotated sentence "She blames the Government for failing to do enough to help". In this sentence, the following semantic roles are distinguished: Judge in the pair (She, blames), Evaluee in the pair (blames, the Government), Reason in the pair (blames, for failing to do enough to help). In the FrameNet database, the considered sentence is represented as follows: [Judge She] blames [Evaluee the Government] [Reason for failing to do enough to help]. In comparison with PropBank, which contains verbal predicates, FrameNet includes not only them but also adjectives, adverbs, and prepositions.

2.2 Abstract meaning representation formalism

In the late 2000s and early 2010s, it was possible to observe a serious incompleteness of the field of SRL. As mentioned above, the principal objective of the studies on SRL was to develop methods and algorithms aimed at discovering the semantic roles realized in sentences. The purpose of discovering semantic roles is the use of this information in building SRs of sentences and discourses for interacting with the pragmatic subsystems of applied intelligent systems. However, the scholars in the field of SRL possessed only rather restricted formal tools for building SRs of sentences. First of all, they felt the lack of convenient formal means for building semantic images of compound descriptions of objects and situations, of sentences with attributive clauses of purpose, of sentences with infinitive constructions, and of sentences expressing modalities. That is why in the late 2000s and early 2010s the scholars looked for more expressive semantic formalisms. As a result, new attention was attracted to the semantic formalism called Abstract Meaning Representation (AMR), introduced in [34]. This formalism began its new life in a modified form after the publication of the paper [1]. An AMR of a sentence S is an acyclic, rooted, directed graph with special marks of the vertices and edges. According to [34], a mark of a vertex has the form (label/concept), where label is a mark of an entity (e.g., label = m1) and concept is a string of the form |wd1| or |wd1, …, wdk|, where wd1, …, wdk are the words or word combinations expressing one notion (examples: |dog|, |eat, take in|). The paper [1] considers additional forms of concepts' descriptions: the framesets of the linguistic database PropBank ("want-01", etc.), special entity types ("world-region", etc.), the kinds of quantities ("distance-quantity", etc.), and the logical connectives "and", "or". It is possible to distinguish several main reasons explaining the quickly increasing interest in AMR. Reason 1. The possibility to explicitly indicate semantic roles in the descriptions of events.
It should be noted that AMRs use the generalized semantic roles arg0, …, arg5 employed in PropBank framesets [1]. Example 1 [1]. The sentence "The man described the mission as a disaster" can be associated with the AMR (d/describe-01 :arg0 (m/man) :arg1 (m2/mission) :arg2 (d2/disaster)).

Reason 2. The possibility to build compound designations of various entities from application domains. Example 2 [1]. The expression "a singing boy from the college" can be associated with the AMR (b/boy :arg0-of (s/sing-01) :source (c/college)).

Reason 3. A way of describing the semantic structure of sentences with infinitive constructions. Example 3 [35]. Let T1 = "The boy wants to go to New York". Then T1 may have the following AMR: (w/want-01 :arg0 (b/boy) :arg1 (g/go-01 :arg0 b :arg1 (c/city :wiki "New York" :name (n/name :op1 "New" :op2 "York")))).

Reason 4. The possibility to describe the semantic structure of sentences with modal words and infinitives. Example 4 [1]. The sentences "The boy doesn't have to go", "The boy isn't obligated to go", and "The boy need not go" may be associated with the AMR (p/obligate-01 :arg2 (g/go-01 :arg0 (b/boy)) :polarity -).

Other reasons are the possibilities to describe the semantic structure of (a) questions with interrogative words; (b) noun groups (e.g., "Elsevier N.V., the Dutch publishing group"); (c) sentences expressing the conceptual qualification relation ("This woman is a lawyer", etc.).
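To fix the shape of such AMR graphs in code before turning to their shortcomings, the sketch below encodes the AMR of Example 3 as nested Python tuples (variable, concept, list of (role, value) pairs) and renders it in PENMAN-like notation. The encoding is our own illustrative assumption, not an official AMR tool; a bare variable such as b expresses a re-entrancy (the boy is both the wanter and the goer).

# An illustrative encoding of an AMR graph as nested tuples:
# (variable, concept, [(role, value), ...]); a bare string value that
# repeats an earlier variable expresses a re-entrant edge.
amr = ("w", "want-01", [
    (":arg0", ("b", "boy", [])),
    (":arg1", ("g", "go-01", [
        (":arg0", "b"),                       # re-entrant reference
        (":arg1", ("c", "city", [
            (":wiki", '"New York"'),
            (":name", ("n", "name", [
                (":op1", '"New"'),
                (":op2", '"York"'),
            ])),
        ])),
    ])),
])

def to_penman(node, indent=0):
    """Render the nested-tuple encoding in PENMAN-like notation."""
    if isinstance(node, str):
        return node                           # variable reference or literal
    var, concept, edges = node
    pad = " " * (indent + 4)
    text = f"({var}/{concept}"
    for role, value in edges:
        text += f"\n{pad}{role} {to_penman(value, indent + 4)}"
    return text + ")"

print(to_penman(amr))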
It is possible to distinguish the following principal shortcomings of the AMR notation from the standpoint of using it in the models and algorithms of semantics-oriented NL processing.

1. Our linguistic intuition says that (a) the main words and word combinations of a sentence refer to various things, situations, and abstract entities; (b) there are various directed semantic connections between the fragments of the sentence, in particular, between such main words and word combinations. A directed graph with special marks of the vertices and edges is a structure visualizing quite well this perception of a sentence by our linguistic intuition. However, this product of scientific thought can be characterized as a surface-level, rather than deep, penetration into the mechanisms of NL semantics. That is why the AMR notation makes only a rather small contribution to the creation of models reflecting the essence of sentence understanding with respect to a knowledge base.

2. The linguistic intuition of scholars (not only of linguists) having command of several natural languages (e.g., of Russian and English, or of English, French, and German) says that there are several mental mechanisms underpinning the construction of NL semantic structures in different languages. For instance, English, Russian, French, and German do have infinitive constructions and compound descriptions of sets. However, the AMR approach doesn't formulate any conjecture about a system of expressive mechanisms responsible for constructing mental representations of sentences even in one language - in English.

3. Due to the above, the AMR approach doesn't give a special formal status to such constructions as semantic images of infinitive expressions, compound designations of sets, and sentences with modality. That is why the AMR approach seems to be of small use for constructing semantics-oriented models of NL communication.

4. The group of general semantic relations used in AMR seems to be a huge bag containing, in particular, such relations of different kinds as :age, :destination, :consist-of, and :purpose. The first unit is the name of a function, while the second to fourth units are the names of relations that are not functions. These principal peculiarities are not taken into account by the AMR approach.

5. The AMR approach says nothing about the SRs of discourses.

3 Task statement

It seems reasonable to analyse the new demands on computational semantics in the context of the problems faced by computational linguistics (CL) as a whole. The analysis of many publications describing projects on NLP shows the existence of a gap (very often, a huge gap) between the employed theoretical tools and the real demands of the studied problems. Let's consider only one example. The linguistic processor BLUE (= Boeing Language Understanding Engine) was developed as an advanced information processing tool for the Boeing company. The system is able to build SRs of sentences of many kinds. In the first section of one of the papers describing BLUE, the authors state that the system uses the formal means of first-order logic (FOL) for constructing SRs of sentences [8]. However, we learn from the second section of the same paper that the system BLUE "allows propositions to themselves be arguments to other propositions as a nested structuring". For instance, the system constructs an SR of the sentence "The man wanted to leave the house". This step immediately leads us beyond the scope of FOL. The reason is that atomic formulas of FOL can't include arguments being formal semantic images of infinitive constructions ("to leave the house", etc.). That is why the Boeing system BLUE, in fact, has no adequate theoretical background. Analysing the development of CL as a whole during the last twenty-five years, it is possible to observe a shift to numerous engineering projects solving particular practical tasks and a lack of attention to fundamental studies. It seems that one of the brightest descriptions of the recent and current situation in CL is given by Dr. Shuly Wintner from the Computer Science Department of the University of Haifa, Israel [57]. The starting point for Dr. Wintner was a high appreciation of the role played by mathematical theories in the development of many branches of engineering. For instance, aerodynamics underpins the design of airplanes, and hydrodynamics is the basis for constructing ships. In this connection the following questions were posed by Dr. Wintner: "What branch of science underlies NL Engineering? What is the theoretical infrastructure on which we build our applications? And what kind of mathematics is necessary for reasoning about human languages?" It would be very natural to expand this list of fundamental questions by adding the question posed in [36]: "How to formally represent the semantics of language?". The need for developing a comprehensive formal framework for creating an MSW makes highly topical the question about the mathematical foundations of computational semantics, the core of modern CL.
The analysis shows that the current state of computational semantics demands the development of an application-independent semantic formalism being convenient: (a) for describing the semantic structure of sentences including, in particular, infinitive and gerundial (for English) constructions expressing goals, commitments, commands, wishes, etc., attributive clauses of purpose, complex direct and indirect speech, and compound designations of notions and sets; (b) for presenting the semantic structure of discourses, in particular, of discourses with references to the meanings of previous sentences or larger fragments of the text; (c) for building representations of knowledge pieces, including the definitions of notions; (d) for constructing formal representations of simple and compound goals of people, robots, and organizations. This combination of expressive mechanisms is not proposed by FOL, Discourse Representation Theory, the Theory of Conceptual Graphs, Episodic Logic [48], or Abstract Meaning Representation. It is also possible to look at the formulated task from a more general position. The analysis of the scientific literature on semantic parsing and an MSW provides serious arguments in favour of the following conjecture: it is high time to create a new paradigm for considering the numerous theoretical problems encountered while constructing and processing various conceptual structures associated with Web-based informational sources: semantic representations of fragments of written and spoken texts (in other terms, text meaning representations); high-level conceptual descriptions of visual images; knowledge pieces stored in ontologies; the content of messages sent by computer intelligent agents, etc. How to find a key to solving this problem? We do know that, using NL, we are able to describe various pieces of knowledge, the semantic content of a visual image, the semantic content of a film, etc. That is why it can be conjectured that a key to elaborating a new paradigm of the described kind could be the construction of a broadly applicable and flexible Conceptual Metagrammar. It is to be a collection of rules (or partial operations) enabling us to construct step by step an SR of a practically arbitrary sentence or discourse pertaining to mass spheres of professional activity of people. In [29], the term "a comprehensive semantic formal environment" is used in the same sense. The prefix "meta" in the term "metagrammar" means that such rules are to use the information associated with classes of conceptual units. That is why we should be able to employ the same system of rules with different conceptual vocabularies.

4 Theory of K-representations as a source of a broadly applicable and flexible semantic formalism

Happily, a solution to the formulated problem is already available. It is given by the theory of K-representations (TKR). It should be underlined that its approach to describing the semantic structure of NL-texts is free from the listed shortcomings of AMR. In order to better understand the peculiarity of TKR, let's establish an analogy with bionics. Bionics studies the peculiarities of the structure and functioning of living beings in order to discover new ways of solving certain technical problems.
TKR was developed as a consequence of fulfilling a system analysis of the basic expressive mechanisms of NL and putting forward a conjecture about a system of partial operations on conceptual structures underpinning these expressive mechanisms.

4.1 Two versions of a broadly applicable and flexible conceptual metagrammar

The first basic constituent of TKR is two versions of a mathematical model (Model 1) describing a system of ten partial operations on conceptual structures. The first version (Model 1-A) is published in [17]. It should be noticed that the 9th operation introduced in [17] is modified in [18]. Model 1-A is the kernel of the theory of restricted standard knowledge languages (RSK-languages). The predecessor of this theory is the theory of S-calculuses and S-languages (see [11] and a retrospective outline in Section 2.3 of [25]). The second version (Model 1-B) is published in the monographs [21, 25] and is the kernel of the theory of standard knowledge languages (SK-languages). Each version of Model 1 gives us formal means convenient for building SRs of, likely (it is a hypothesis), arbitrarily complex sentences and discourses in NL pertaining to mass spheres of professional activity (engineering, business, medicine, etc.). The difference between Models 1-A and 1-B is as follows. Model 1-A allows us to proceed from only one angle of look at an entity from the considered thematic domain. To the contrary, Model 1-B makes it possible to consider an entity from several possible angles of look. Example. Both Model 1-A and Model 1-B consider a finite set of symbols St and the countable non-intersecting sets of symbols X and V. The elements of the sets X and V are interpreted respectively as primary informational units and variables. The set St (its elements are called sorts) is a subset of X. Suppose also that Model 1-A includes a mapping tp1 from the union of X and V into the countable set of symbols Types1, and Model 1-B includes a mapping tp2 from the union of X and V into the countable set of symbols Types2. Here Types1 and Types2 contain the symbols and strings interpreted as semantic characteristics of entities from the considered domains. Both Types1 and Types2 include the subset of sorts St, and Types1 is a subset of Types2. Suppose that X includes the unit D.Mendeleev; it denotes the famous Russian chemist Dmitry I. Mendeleev, the author of the periodic table of elements. Let St include the sorts ints and dyn.phys.ob ("intelligent system" and "dynamic physical object"). Then it is possible that either tp1(D.Mendeleev) = ints or tp1(D.Mendeleev) = dyn.phys.ob, but tp2(D.Mendeleev) = ints * dyn.phys.ob. Subsection 4.3 very shortly, without numerous mathematical details, characterizes the ten partial operations from Model 1-A and Model 1-B. Due to the very general level of discussion, the material of Subsection 4.3 illustrates the partial operations both from Model 1-A and Model 1-B. Due to the lack of mathematical details, the shortly described operations may seem to be very simple. However, Model 1-A and Model 1-B are strictly mathematical models; they define respectively two new classes of formal languages: the classes of RSK-languages and SK-languages. These models were developed due to the invention of an original methodology of constructing inductive definitions of formal objects with complex structure (see [17, 25]).
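Returning to the Mendeleev example, the contrast between tp1 and tp2 can be sketched informally in Python. Reading the compound type ints * dyn.phys.ob as a set of sorts is only one possible simplification of the formal definitions in [17, 21, 25], adopted here purely for illustration.

# An informal sketch of the typing maps tp1 and tp2.  The sorts and
# the unit D.Mendeleev come from the example in the text; modelling a
# compound type as a frozenset of sorts is our own simplification.
SORTS = {"ints", "dyn.phys.ob"}   # "intelligent system", "dynamic physical object"

# Model 1-A: one angle of look, so a single sort per primary unit.
tp1 = {"D.Mendeleev": "ints"}     # or "dyn.phys.ob", but never both

# Model 1-B: several angles of look, so a compound type.
tp2 = {"D.Mendeleev": frozenset({"ints", "dyn.phys.ob"})}
assert all(s in SORTS for s in tp2["D.Mendeleev"])   # built from sorts

def is_of_sort(unit, sort, tp):
    """Can the primary unit be viewed under the given sort?"""
    t = tp[unit]
    return sort == t if isinstance(t, str) else sort in t

print(is_of_sort("D.Mendeleev", "dyn.phys.ob", tp1))   # False in Model 1-A
print(is_of_sort("D.Mendeleev", "dyn.phys.ob", tp2))   # True in Model 1-B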
The analysis of the scientific literature on artificial intelligence theory and on mathematical and computational linguistics shows that today the class of SK-languages opens the broadest prospects for building semantic representations (SRs) of NL-texts (i.e., for representing the meanings of NL-texts in a formal way).

4.2 The models of linguistic database and algorithms of semantic parsing

The second basic constituent of TKR is two broadly applicable mathematical models of a linguistic database (LDB) [21, 25]. The models describe the frames expressing the necessary conditions of the existence of semantic relations, in particular, in word combinations of the kinds "Verbal form (verb, participle, gerund) + Preposition + Noun", "Verbal form + Noun", "Noun1 + Preposition + Noun2", "Noun1 + Noun2", "Number designation + Noun", "Attribute + Noun", "Interrogative word + Verb". The expressive power of SK-languages enables us to associate the lexical units with appropriate simple or compound semantic units. The models describe the logical structure of the LDBs being the components of NL interfaces to intelligent databases as well as to other applied computer systems. The third basic constituent of TKR is several complicated, strongly structured algorithms carrying out semantic parsing of texts from some practically interesting sublanguages of NL. The first and second algorithms, called SemSyn and SemSynt1 respectively, are based on the elaborated formal models of an LDB. The algorithm SemSyn [21] transforms an NL-text into its SR being a K-representation; the algorithm SemSyn is described in the two final chapters of the monograph [21], and the algorithm SemSynt1 is set forth in Chapters 9 and 10 of the monograph [25]. An important feature of these algorithms is that they don't construct any syntactic representation of the input NL-text but directly find semantic relations between text units. Since numerous lexical units have several meanings, the algorithm uses the information from a linguistic database and the linguistic context for choosing one meaning of a lexical unit among several possible meanings. The other distinctive feature is that these structured algorithms are completely described with the help of formal tools; that is why they are problem-independent and don't depend on a programming system. The algorithm SemSyn is implemented in the programming language Python. Additional information about the algorithms of semantic parsing proposed by TKR can be found in Section 5.

4.3 About ten partial operations on conceptual structures

The expressions of SK-languages will be called below K-strings. If Expr is an expression in NL and a K-string Semrepr can be interpreted as an SR of Expr, then Semrepr will be called a possible K-representation (KR) of the expression Expr. The KRs of NL-texts are formed from the primary informational units, the variables, and several service symbols by means of an iterative process of applying the operations of building well-formed formulas Op[1], …, Op[10]. The initial set of simplest formulas is determined by a special formal object called a conceptual basis (c.b.) and playing the role of the simplest knowledge base [21, 25]. The language determined by the considered c.b. B and the operations Op[1], …, Op[10] (they are defined by special statements, or rules, P[1], …, P[10]) is denoted as Ls(B) and is called the standard knowledge language (SK-language) in the basis B [21, 25].
The rule P[0] provides an initial stock of formulas. For example, if the string mouse1 is an element of a certain primary informational universe X(B), then mouse1 is a formula of Ls(B). For an arbitrary c.b. B, let Degr(B) be the union of all Cartesian m-degrees of Ls(B), where m is not less than 1. Then the meaning of the rules P[1], …, P[10] of constructing well-formed formulas can be explained as follows: for each k from 1 to 10, the rule P[k] determines a partial unary operation Op[k] on the set Degr(B) with the value being an element of Ls(B). Let's consider a short introduction to the partial operations Op[1], …, Op[10] for constructing formal representations of structured meanings. The operation Op[1] can be used to join intensional quantifiers to the designations of notions and produce formulas like certain car, certain car * (Manufacturer, IBM), all car * (Manufacturer, BMW). The operation Op[2] can be used to construct formulas like f(a1, …, an), where f is a functional symbol and a1, …, an are well-formed formulas of Ls(B). For example, Area(certain country) is a well-formed formula of a certain SK-language Ls(B). The operation Op[3] can be used to construct expressions of the form (a ≡ b). E.g., (Area(certain country) ≡ x12). The operation Op[4] can be used to construct formulas like rel(a1, …, an), where rel is a relational symbol and a1, …, an are formulas of Ls(B). E.g., Less(Area(certain country), 600,000/sq.km). The operation Op[5] allows us to mark KRs by some variables from the set of variables. For example, if a part of a KR looks like certain file1 * (Extension, ".docx") : v1, then we can refer to the expression certain file1 * (Extension, ".docx") in another part of a K-representation, using v1. The operation Op[6] provides the possibility to construct K-representations of the form ¬Formula, for example ¬car. The operation Op[7] allows us to use conjunction and disjunction in the formulas, e.g., (airplane ∨ helicopter), (mathematician ∧ painter). The operation Op[8] can be used to build compound designations of notions of the form concept * (r1, value1) … (rn, valuen), where concept is an element of a primary informational universe X(B) denoting a notion, r1, …, rn are the names of functions or relations, and value1, …, valuen are well-constructed formulas. This operation allows us to construct the formula country * (Location, Europe)(Capital, Vienna), being a KR of the expression "a country in Europe with the capital Vienna". The operation Op[9] allows us to use the quantifiers ∀ and ∃ like in FOL. The operation Op[10] enables us to build representations of ordered n-tuples as expressions of the form ⟨a1, …, an⟩, where a1, …, an are well-constructed formulas. Such n-tuples could be used to construct representations of complex verb constructions, for example Delete(⟨…⟩, ⟨…⟩, ⟨…⟩), where each argument is an n-tuple.
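For illustration only, several of these operations can be imitated by naive string builders in Python. The sketch below composes Op[8], Op[1], and Op[5] to produce the kind of K-string used in the examples of Subsection 4.4; it deliberately ignores the conceptual basis and the type checking that the real inductive definitions impose.

# A naive imitation of some of the operations Op[1]-Op[10] as string
# builders.  Real SK-languages are defined inductively over a
# conceptual basis with typing constraints; none of that is modelled.
def op8(concept, *pairs):      # Op[8]: compound designation of a notion
    return concept + " * " + "".join(f"({r}, {v})" for r, v in pairs)

def op1(quantifier, notion):   # Op[1]: join an intensional quantifier
    return f"{quantifier} {notion}"

def op5(kstring, variable):    # Op[5]: mark a K-string with a variable
    return f"{kstring} : {variable}"

def op2(func, *args):          # Op[2]: apply a functional symbol
    return f"{func}({', '.join(args)})"

def op4(rel, *args):           # Op[4]: apply a relational symbol
    return f"{rel}({', '.join(args)})"

# "a French textbook on biology" (Example 1 of Subsection 4.4):
print(op5(op1("certain",
              op8("textbook1", ("Country", "France"),
                               ("Activity-field", "biology"))),
          "x15"))
# -> certain textbook1 * (Country, France)(Activity-field, biology) : x15

# The Op[4] example from above:
print(op4("Less", op2("Area", op1("certain", "country")), "600,000/sq.km"))
# -> Less(Area(certain country), 600,000/sq.km)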
4.4 SK-languages as a tool of describing semantic structure of sentences

Before considering a number of examples illustrating the correspondence between an expression in NL and its possible KR, let's agree that the string Semrepr is to be interpreted as a possible KR of the regarded expression in NL.

Compound semantic descriptions of objects and sets of objects. The key role is played by the interaction of the operations Op[8], Op[1], and Op[5]. Using the operation Op[8] at the last step of constructing a formula and any of the operations Op[1], …, Op[10] at the previous steps, it is possible to construct an expression of the form conc * (rel1, d1) … (reln, dn), where conc is a simple (non-structured) designation of a notion, n ≥ 1, and for k = 1, …, n, relk is either the name of a function with one argument or the name of a binary relation. In the first case dk designates the value of the function relk, and in the second case dk designates the second attribute of the relation relk. Applying consecutively the operations Op[1] and Op[5], we can obtain an expression of the form qtr conc * (rel1, d1) … (reln, dn) : var, where qtr is an intensional quantifier (in particular, it may correspond to the meanings of the words and expressions "a certain", "any", "all") and var is a variable. Example 1. We can construct compound designations of the entities mentioned in texts. For example, the expression "a French textbook on biology" may be associated with the semantic image certain textbook1 * (Country, France)(Activity-field, biology) : x15. Example 2. It is possible to build compound designations of the mentioned sets, e.g., certain set1 * (Number-of-elements, 4)(Qualitative-composition, container1 * (Content1, ceramics * (Country-producer, (India OR China)))) : S7, where set1 designates the notion "finite set".

Building semantic representations of compound infinitive constructions. Example 1. Let Goal1 = "To receive an M.Sci. degree in business informatics at the Higher School of Economics (Moscow) and to found a company on e-business". Then a possible K-representation of Goal1 is (receiving1 * (Institution-role, certain university * (Name1, "Higher School of Economics")(Location, certain city * (Name1, "Moscow") : x1))(Document-role, certain acad-degree * (Kind, M.Sci.)(Field1, business-informatics) : x2) ∧ founding1 * (Organization-role, certain firm1 * (Field1, e-business) : x3)).

Representation of the meanings of sentences with indirect speech. Let T1 = "When did Mr. Peter Smith announce that he would visit Montpellier in April?". Then Semrepr = Question (t1, Situation (e1, informing1 * (Time, certain mom * (Before, #now#) : t1)(Agent1, certain man * (First-name, "Peter")(Surname, "Smith") : x1)(Inform-content, Situation (e2, visit1 * (Agent1, x1)(Location2, certain city * (Name1, "Montpellier") : x2)(Time, Nearest-month-future (April, #now#)))))).

Representing the meanings of sentences with subordinate clauses of purpose. Let T2 = "Mr. Peter Smith, a Vice-President of the firm "Rainbow", announced yesterday that he would visit Montpellier in April in order to sign an agreement with the company "CIRAD"". Then Semrepr = Situation (e1, informing1 * (Time, Previous-day (#now#))(Agent1, certain man * (First-name, "Peter")(Surname, "Smith") : x1)(Inform-content, Situation (e2, visit1 * (Agent1, x1)(Location2, certain city * (Name1, "Montpellier") : x2)(Time, Nearest-month-future (April, #now#))(Goal, signing2 * (Inform-object, certain agreement1 : x3)(Business-partner, certain company1 * (Name1, "CIRAD") : x4))))).

Semantic representation of homogeneous members of a sentence. Let T3 = "Jean would like to visit during this summer either Vienna, Bratislava, and Prague or Bergen, Oslo, and Stockholm".
Then Semrepr = Situation (e1, intention * (Time, #now#)(Emotional-agent, certain man * (First-name, "Jean") : x1)(Goal, visit1 * (Time, Nearest-season (summer, #now#))(Location2, (((certain city * (Name1, "Vienna") : x2) ∧ (certain city * (Name1, "Bratislava") : x3) ∧ (certain city * (Name1, "Prague") : x4)) ∨ ((certain city * (Name1, "Bergen") : x5) ∧ (certain city * (Name1, "Oslo") : x6) ∧ (certain city * (Name1, "Stockholm") : x7)))))).

Semantic descriptions of expressions with the words "a notion", "a term". Let S1 = "The term gene was first coined in 1909 by a Danish botanist, Johannsen, and was derived from the term pangen introduced by De Vries". Then Semrepr1 = Situation (e1, introduction1 * (Notion-name, certain notion * (Called, "gene") : c1)(Agent1, certain botanist1 * (Surname, "Johannsen")(Country-role, Denmark) : x1)(Time, 1909)) ∧ Situation (e2, derivation1 * (Notion-name, c1)(Agent1, x1)(Source-notion, certain notion * (Called, "pangen")(Authorship, certain person * (Surname, "De Vries") : x2))).

4.5 SK-languages as a tool of describing semantic structure of discourses and representing knowledge pieces

Example 1. Let Disc = S1. S2, where S2 = "This information is given in the textbook "Emery's Elements of Medical Genetics" by D. Turnpenny and S. Ellard; its 12th edition was published by Elsevier in 2005". Then Disc may have a KR of the form (Semrepr1 : P1 ∧ Information-source (P1, Semrepr2)), where Semrepr2 is the following possible KR of the sentence S2: certain textbook1 * (Title, "Emery's Elements of Medical Genetics")(Authorship, (D. Turnpenny ∧ S. Ellard))(Edition-number, 12)(Publishing-house, Elsevier)(Year, 2005) : x3. Here P1 is the variable marking the meaning of the first phrase of the text Disc. Example 2. Let Def = "Control gene is a gene which can turn other genes on or off". Then Semrepr3 = (Control-gene ≡ gene * (Is-able, (turning-on * (Object-bio, some gene : Set1) ∨ turning-off * (Object-bio, Set1)))). Example 3. It is possible to construct a different KR of the definition Def; it will reflect the metadata of the information piece, indicating the edition, the authors, and the year of publication. In this case Semrepr-with-metadata = certain inform-object * (Content1, Semrepr3)(Authorship, (D. Turnpenny ∧ S. Ellard))(Publishing-house, Elsevier)(Year, 2005)(Title, "Emery's Elements of Medical Genetics")(Edition-number, 12).

5 Principal distinctive features of two original approaches to semantic parsing

The theory of K-representations not only introduced a new class of formal languages (the class of SK-languages) for building SRs of complex sentences and discourses. It also used the definition of this class of formal languages as a starting point for developing two broadly applicable mathematical models of a linguistic database ([22], Chapter 6 of [21], and Chapter 7 of [25]) and an original method of extracting structured meanings from NL-texts (Chapter 8 of [25]). Here the term "method" denotes a method of developing multilingual algorithms of semantic-syntactic analysis of texts in NL. Such algorithms transform texts from certain sublanguages of NL into SRs (in other terms, text meaning representations). For building SRs, the class of SK-languages is used. The input texts may be at least from broad and practically interesting sublanguages of English, German, and Russian. The proposed method underpinned the development of a multilingual algorithm of semantic parsing, SemSynt1 (Chapters 9 and 10 of [25]).
It is the composition of two algorithms called BuildMatr1 and BuildSem1. The algorithm BuildMatr1 can be qualified as an original algorithm of semantic role labeling. The input texts may be questions of many kinds, commands, sentences, and discourses. The output of BuildMatr1 (more exactly, its principal part) is a special string-digital matrix Matr called a matrix semantic-syntactic representation (MSSR) of the input text. The matrix Matr is dynamically linked with an auxiliary data structure being a two-dimensional array Arls. In case an elementary meaningful text unit (or a token) wd has N different meanings, the array Arls will include N consecutive rows, where for k = 1, …, N the k-th row stores the information associated with the k-th meaning of wd. The configuration of an MSSR Matr changes during the semantic-syntactic processing of the input text. Each configuration determines, in particular, a marked oriented graph with the vertices being the distinguished elementary meaningful text units (or tokens) and a mapping from the subset of the vertices of this graph corresponding to lexical items to the set of meanings (or values) associated with these lexical items via the array Arls. Before the start of the text's processing, an edge from each lexical unit wd goes to the first row of Arls (that is, the row with the minimal order number) storing the semantic units associated with wd. Figure 1 illustrates this situation for processing the command "Download the green container on the platform". Here V1[1] is the value downloading1 (downloading a file), V1[2] is the value downloading2 (downloading a transportable physical object); V2[1] is the value green-colour, V2[2] is the value not-ripe, V2[3] designates the value a-member-of-green-movement; V3[1] is the value thing-container, V3[2] is the value data-structure-of-RDF; V4[1] is the value computer-platform, V4[2] is the value railway-station-platform, V4[3] is the value political-platform. Figure 2 illustrates the final situation.

Figure 1: Initial graph and mapping determined by an MSSR Matr.

Figure 2: Final graph and mapping determined by an MSSR Matr.

The output of the algorithm BuildMatr1 is the input of the algorithm BuildSem1. It transforms the information represented by an MSSR Matr of the input text into a possible SR of this text, which is a KR of the input text. Example. The command "Download the green container on the platform" can be associated with a possible KR of the form Command (#Operator#, #Executor#, #now#, downloading2 * (Object1, certain thing-container * (Colour, green) : x1)(Destination, certain railway-station-platform : x2)).
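As a simplified illustration of the structures just described (the real Arls rows and the MSSR carry many more fields), the candidate meanings of the tokens of this command and the initial and final token-to-meaning mappings can be sketched as follows:

# A simplified sketch of the array Arls for the command
# "Download the green container on the platform".  Each meaningful
# token is associated with the ordered list of its candidate meanings
# (the values V1[1], V1[2], ... from the text).
arls = {
    "download":  ["downloading1",            # downloading a file
                  "downloading2"],           # moving a physical object
    "green":     ["green-colour", "not-ripe", "a-member-of-green-movement"],
    "container": ["thing-container", "data-structure-of-RDF"],
    "platform":  ["computer-platform", "railway-station-platform",
                  "political-platform"],
}

# Initial mapping (Figure 1): every token points at the first row of
# its group in Arls.
interpretation = {token: meanings[0] for token, meanings in arls.items()}

# After semantic-syntactic processing, the context (a physical object
# being moved to a place) selects other rows; the final mapping of
# Figure 2 would look roughly like this:
interpretation.update({
    "download": "downloading2",
    "container": "thing-container",
    "platform": "railway-station-platform",
})
print(interpretation)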
The paper [44] expands the method introduced in Chapter 6 of [25]. On the one hand, the input language of the algorithm BuildMatr1 is enriched by means of the phrases expressing (a) the values of functions, (b) the restrictions of the functions' values, and (c) the relations between various objects formed with the help of comparative adjectives. On the other hand, it is well known that many notions corresponding to the words and word combinations from NL-texts are too general to be used for the interaction with a database. For instance, these are the concepts "IT-specialist" and "alumni". That is why it is proposed to use for the semantic parsing of NL-texts not only a linguistic database but also a linguistic knowledge base (LKB). It may consist of K-strings of the form illustrated by the following example: (IT-specialist ≡ person * (Qualification, (programmer ∨ database-administrator ∨ web-programmer))). Let's call unfolding concepts the concepts being the left parts of some expressions in the LKB. The proposed final step of processing NL-texts is to replace all semantic items from the constructed primary SR belonging to the subclass of unfolding concepts by less general concepts with the help of the definitions stored in the used LKB (it may be interpreted as a part of an ontology). E.g., the concept "IT-specialist" will be replaced by the compound concept person * (Qualification, (programmer ∨ database-administrator ∨ web-programmer)). The paper [45] introduces a highly compact way of describing the formal structure of linguistic databases (the semantic-syntactic component) and of presenting the algorithms of semantic parsing. The paper contains the algorithm of semantic parsing SemSyntRA, developed within the framework of the proposed approach (see also the next section).

6 Applications of the K-representations theory

The arguments stated above and numerous additional arguments set forth in the monograph [25] give serious grounds to conclude that the class of SK-languages, provided by TKR, can be interpreted as the first comprehensive semantic formal environment for studying various semantics-associated problems of developing an MSW. It seems reasonable to speak about two levels of applying TKR to solving practical tasks. The first level is the direct use, in the design of NL processing systems, of the mathematical model of a linguistic database introduced in Chapter 7 of the monograph [25] and of the algorithm of semantic parsing SemSynt1 described in Chapters 9 and 10 of the same monograph. This algorithm is multilingual: its input texts may be questions of many kinds, statements, and commands from sublanguages of English, German, and Russian. The mentioned model and algorithm were applied by the author and his Ph.D. students to the design of an NL interface of a recommender system [41], to the design of an advanced semantic search system [30], and to the design of an NL interface of an applied intelligent system making easier the interaction of a user with the file system of a computer [44, 45]. Two versions of this system are called NLC-1 [44] and NLC-2 [45] (here NLC = Natural Language Commander). Example. Let's look at how NLC-1 processed the following user instruction: "Copy music files from "Download" folder to folder with name "Music" or "My music" on backup drive if their size is less than 1 GB". This instruction has the following primary K-representation, constructed by SemSynt2 - a modification of the algorithm SemSynt1: If-then(Less(SizeOf(all music1 * (Place1, certain folder1 * (Name1, "Download")) : o1), 1/GB), Command (#Operator#, #Executor#, #now#, copying * (Source1, o1)(Destination1, certain folder1 * (Name1, ("Music" ∨ "My music"))(Place1, certain backup-drive)))). Now if the knowledge base of NLC-1 contains the K-strings (music1 ≡ file1 * (Extension, ("mp3" ∨ "ogg" ∨ "wav" ∨ "aac"))) and (backup-drive ≡ drive1 * (Name1, "F")), and the knowledge management system includes the rule (x, (x ≡ y) ├ y), then NLC-1 transforms the constructed primary KR of the user instruction into the secondary KR If-then(Less(SizeOf(all file1 * (Extension, ("mp3" ∨ "ogg" ∨ "wav" ∨ "aac"))(Place1, certain folder1 * (Name1, "Download")) : o1), 1/GB), Command (#Operator#, #Executor#, #now#, copying * (Source1, o1)(Destination1, certain folder1 * (Name1, ("Music" ∨ "My music"))(Place1, certain drive1 * (Name1, "F"))))). Then the resulting shell script is if [ $(du -cb Download/*.mp3 Download/*.ogg Download/*.wav Download/*.aac | grep total | sed -e "s/\s.*$//") -le 1000000000 ]; then cp Download/*.mp3 Download/*.ogg Download/*.wav Download/*.aac "/f/$(ls /f/ | grep -iE '^(Music|My music)$' | head -n1)"; fi
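The passage from the primary to the secondary KR can be pictured as substitution driven by the ≡-definitions of the knowledge base. The following deliberately naive Python sketch applies the rule (x, (x ≡ y) ├ y) by raw string replacement, whereas NLC-1, of course, rewrites K-strings structurally; the sketch assumes the definitions are non-recursive.

# A naive sketch of unfolding concepts via the knowledge base: every
# defined concept occurring in the primary KR is replaced by the
# right-hand side of its definition.  Raw string substitution is used
# only for illustration.
knowledge_base = {
    "music1": 'file1 * (Extension, ("mp3" ∨ "ogg" ∨ "wav" ∨ "aac"))',
    "backup-drive": 'drive1 * (Name1, "F")',
}

def unfold(kr, kb):
    """Produce a secondary KR by unfolding every defined concept.
    Assumes non-recursive definitions, so one pass suffices."""
    for concept, definition in kb.items():
        kr = kr.replace(concept, definition)
    return kr

primary = ('If-then(Less(SizeOf(all music1 * (Place1, certain folder1 * '
           '(Name1, "Download")) : o1), 1/GB), ...)')
print(unfold(primary, knowledge_base))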
Written in the Haskell programming language, NLC-1 is a flexible and scalable application. It can be configured by a researcher for different domains and underlying shells. The paper [45] describes a modified theoretical foundation of the second version, NLC-2. The great advantages of the proposed comprehensive semantic formal environment are promised by the second level of applications: it is the case of using SK-languages for describing lexical semantics, representing the semantic content of sentences and discourses in NL, building models of advanced ontologies, constructing semantic annotations of Web documents (see Section 6.2 of [25]), and forming high-level conceptual descriptions of visual images (see Section 6.3 of [25]) in numerous scientific centres and research groups throughout the world.

7 A contribution to developing a Multilingual Semantic Web

Endowing the existing Web with the ability to understand many natural languages is an objective, ongoing process. The analysis has shown that there is a way to increase the overall effectiveness of this global decentralized process. It would be especially important with respect to the need for cross-language conceptual information retrieval and question answering. The way proposed in [25-29] is a possible new paradigm for the mainly decentralized process of endowing the existing Web with the ability of processing many natural languages. The principal idea of the new paradigm is as follows. There is a common thing for the various texts in different natural languages: the fact that NL-texts have meanings. The meanings are associated not only with NL-texts but also with the visual images (stored in multimedia databases) and with the pieces of knowledge from ontologies. That is why great advantages are promised by the realization of the situation when a unified formal semantic environment is used in different projects throughout the world for reflecting the structured meanings of texts in various natural languages, for representing knowledge about application domains, for constructing semantic annotations of informational sources, and for building high-level conceptual descriptions of visual images. The analysis of the expressive power of SK-languages (see Chapters 3-6 of [25]) shows that the SK-languages can be used as a unified formal semantic environment of this kind.
This idea underlies an original strategy of transforming, step by step, the existing Web into a Semantic Web of a new generation, whose principal distinctive feature would be a well-developed ability of NL processing; it can also be qualified as a Multilingual Semantic Web. The versions of this strategy are published in [25-29].

8 Conclusion

Computational semantics has received a firm theoretical ground. The SK-languages, introduced by the theory of K-representations, open new prospects for formalizing lexical semantics, representing the semantic content of sentences and discourses in NL, building models of advanced ontologies, forming high-level conceptual descriptions of visual images, and constructing semantic annotations of Web documents in numerous scientific centres and research groups. Many existing projects on NL processing, including semantic parsing, have received an appropriate theoretical framework for the next stages of research. For an MSW it is also very important that SK-languages provide a convenient intermediary level for moving from NL input to OWL-based ontologies. This paper provides additional arguments in favour of the conjecture formulated in [24-29]: TKR can be and should be used as a comprehensive and flexible basic formal tool for solving the tasks of developing an MSW associated with the semantics of NL.

References

[1] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., Schneider, N. (2013). Abstract Meaning Representation for Sembanking. In: Proceedings of the 7th ACL Linguistic Annotation Workshop and Interoperability with Discourse, Sofia, Bulgaria, August 8-9, 2013 (www.aclweb.org/anthology/W13-2322; retrieved 2016-03-12).
[2] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., Schneider, N. (2015). Abstract Meaning Representation (AMR) 1.2.2 Specification; github.com/amrisi/amr-guidelines/blob/master/amr.md.
[3] Blanco, E., Moldovan, D. (2014). Leveraging verb-argument structures to infer semantic relations. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, Gothenburg, Sweden, April 26-30, 2014. ACL, pp. 145-154.
[4] Bordes, A., Glorot, X., Weston, J., Bengio, Y. (2012). Joint learning of words and meaning representations for open-text semantic parsing. In: Proc. of the 15th Intern. Conf. on Artificial Intelligence and Statistics (AISTATS) 2012, Las Palmas, Canary Islands, Vol. 22, pp. 127-135.
[5] Buitelaar, P., Choi, K.-S., Cimiano, P., Hovy, E. H. (Eds.) (2012). Report from Dagstuhl Seminar 12362 "The Multilingual Semantic Web" (2-9 September, 2012). Schloss Dagstuhl: Leibniz-Zentrum fuer Informatik.
[6] Cambridge Semantics Inc., The Smart Data Company, Web page; http://www.cambridgesemantics.com/semantic-university/nlp-and-semantic-web (retrieved 14.10.2016).
[7] Cimiano, P., Haase, P. et al. (2008). Towards portable natural language interfaces to knowledge bases - the case of the ORAKEL system. Data and Knowledge Engineering, Vol. 65, No. 2, pp. 325-354.
[8] Clark, P., Harrison, P. (2008). Boeing's NLP system and the challenges of semantic representation. In: Proc. SIGSEM Symposium on Text Processing (STEP'08), Venice, Italy, ACL, pp. 263-276.
[9] Das, D., Chen, D., Martins, A. F. T., Schneider, N., Smith, N. A. (2014). Frame-semantic parsing. Computational Linguistics, Vol. 40, No. 1, pp. 9-56.
[10] Fillmore, C., Johnson, C. R., Petruck, M. R. L. (2003). Background to FrameNet. International Journal of Lexicography, Vol. 16, No. 3, pp. 235-250.
[11] Fomitchov, V. A. (1984). Formal systems for natural language man-machine interaction modelling. In: Artificial Intelligence. Proc. of the IFAC Symposium, Leningrad, USSR, 4-6 Oct. 1983, Ponomaryov, V. M. (Ed.), Oxford, UK, Pergamon Press Ltd., New York, Pergamon Press Inc., 1984, pp. 203-207 (IFAC Proc. Series, 1984, No. 9).
[12] Fomichov, V. A. (1988). Representing Information by Means of K-calculuses. Textbook. Moscow, The Moscow Institute of Electronic Engineering.
[13] Fomichov, V. A. (1992). Mathematical models of natural-language-processing systems as cybernetic models of a new kind. Cybernetica. Quarterly Review of the International Association for Cybernetics (Belgium, Namur), Vol. 35, No. 1, pp. 63-91.
[14] Fomichov, V. A. (1993). Towards a mathematical theory of natural language communication. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 17, No. 1, pp. 21-34.
[15] Fomichov, V. A. (1993). K-calculuses and K-languages as powerful formal means to design intelligent systems processing medical texts. Cybernetica (Belgium), Vol. XXXVI, No. 2, pp. 161-182.
[16] Fomichov, V. A. (1994). Integral Formal Semantics and the design of legal full-text databases. Cybernetica (Belgium), Vol. XXXVII, No. 2, pp. 145-177.
[17] Fomichov, V. A. (1996). A mathematical model for describing structured items of conceptual level. Informatica. An International Journal of Computing and Informatics (Slovenia), Vol. 20, No. 1, pp. 5-32.
[18] Fomichov, V. A. (1998). Theory of restricted K-calculuses as a comprehensive framework for constructing agent communication languages. In: Fomichov, V. A., Zeleznikar, A. P. (Eds.), Special Issue on NLP and Multi-Agent Systems. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 22, No. 4, pp. 451-463.
[19] Fomichov, V. A. (2000). An ontological mathematical framework for electronic commerce and semantically-structured Web. In: Zhang, Y., Fomichov, V. A., Zeleznikar, A. P. (Eds.), Special Issue on Database, Web, and Cooperative Systems. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 24, No. 1, pp. 39-49.
[20] Fomichov, V. A. (2002). Theory of K-calculuses as a powerful and flexible mathematical framework for building ontologies and designing natural language-processing systems. In: Andreasen, T., Motro, A., Christiansen, H., Larsen, H. L. (Eds.), Flexible Query Answering Systems, 5th Intern. Conference, FQAS 2002, Proceedings, Lecture Notes in Artificial Intelligence, Vol. 2522, Springer: Berlin, Heidelberg, New York, pp. 183-196.
[21] Fomichov, V. A. (2005a). The Formalization of Designing Natural Language Processing Systems. Moscow: MAX Press, 368 p. (in Russian).
[22] Fomichov, V. A. (2005b). A new method of transforming natural language texts into semantic representations. Informational Technologies, Moscow, No. 10, pp. 25-35 (in Russian).
[23] Fomichov, V. A. (2007). Mathematical Foundations of Representing the Content of Messages Sent by Computer Intelligent Agents. Moscow, State University - Higher School of Economics, Publishing House "TEIS", 176 p. (in Russian).
[24] Fomichov, V. A. (2008). A comprehensive mathematical framework for bridging a gap between two approaches to creating a Meaning-Understanding Web. Intern. Journal of Intelligent Computing and Cybernetics, Vol. 1, No. 1, pp. 143-163.
(2010a). Semantics-Oriented Natural Language Processing: Mathematical Models and Algorithms. Springer, New York, Dordrecht, Heidelberg, London, 352 p.
[26] Fomichov, V. A. (2010b). Theory of K-representations as a comprehensive formal framework for developing a Multilingual Semantic Web. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 34, No. 3, pp. 387-396.
[27] Fomichov, V. A. (2011). The prospects revealed by the theory of K-representations for bioinformatics and Semantic Web. Actes de la 18e Conférence sur le Traitement Automatique des Langues Naturelles. Actes de la 15e Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues. Montpellier, France, 27th June - 1st July 2011, Vol. 1: Actes: articles longs. Montpellier: AVL Diffusion, pp. 5-20.
[28] Fomichov, V.A. (2013). A broadly applicable and flexible conceptual metagrammar as a basic tool for developing a Multilingual Semantic Web. In: Metais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (Eds.), Natural Language Processing and Information Systems. 18th Intern. Conference on Applications of Natural Language to Information Systems, NLDB 2013, Salford, UK, June 2013, Proceedings. Lecture Notes in Computer Science, Vol. 7934, Springer, Berlin, Heidelberg, pp. 249-259.
[29] Fomichov, V. A. (2014). SK-languages as a comprehensive formal environment for developing a Multilingual Semantic Web. Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (Eds.), Database and Expert Systems Applications, 25th Intern. Conference, DEXA 2014, Munich, Germany, September 1-4, 2014, Part I, Proceedings. Lecture Notes in Computer Science, Vol. 8644, Cham: Springer International Publishing Switzerland, pp. 394-401.
[30] Fomichov, V.A., Kirillov, A.V. (2012). A formal model for constructing semantic expansions of the search requests about the achievements and failures. Artificial Intelligence: Methodology, Systems, and Applications, Ramsay, A., Agre, G. (Eds.), Lecture Notes in Computer Science, Vol. 7557, Springer, Berlin, Heidelberg, pp. 296-304.
[31] Gildea, D., Jurafsky, D. (2002). Automatic Labeling of Semantic Roles. Computational Linguistics, Vol. 28, No. 3, pp. 245-288.
[32] Google Hummingbird (2016); https://en.wikipedia.org/wiki/Google_Hummingbird (retrieved 10.11.2016).
[33] Kingsbury, P., Palmer, M. (2002). From TreeBank to PropBank. Proceedings of LREC 2002.
[34] Langkilde, I., Knight, K. (1998). Generation that exploits corpus-based statistical knowledge. Proc. of the 36th Annual Meeting of the ACL and 17th International Conference on Computational Linguistics, Montreal, pp. 704-710.
[35] Li, B., Wen, Y., Bu, L., Qu, W., Xue, N. (2016). Annotating the Little Prince with Chinese AMRs. Proc. of LAW X – the 10th Linguistic Annotation Workshop, Berlin, Germany, August 11, 2016, ACL, pp. 7-15.
[36] Liang, P. (2016). Learning executable semantic parsers for natural language understanding. Communications of the ACM, Vol. 59, No. 9, pp. 68-76.
[37] Lu, C., Xu, Y., Geva, S. (2008). Web-based query translation for English-Chinese CLIR. Computational Linguistics and Chinese Language Processing (CLCLP), pp. 61-90.
[38] Marcus, M. P., Marcinkiewicz, M. A., Santorini, B. (1993). Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, Vol. 19, No. 2.
[39] Marquez, L., Carreras, X., Litkowski, K. C., Stevenson, S. (2008). Semantic Role Labeling: an Introduction to the Special Issue. Computational Linguistics, Vol. 34, No. 2, pp. 145-159.
[40] Ontos GmbH company Web page (2016): www.ontos.com.
[41] Pravikov, A.A., Fomichov, V.A. (2010). Development of a recommender system with a natural language interface on the basis of semantic objects' mathematical models. Business Informatics. Interdisciplinary Scientific-Practical Journal, Moscow, State University – Higher School of Economics, No. 4 (14), pp. 3-11.
[42] Punyakanok, V., Roth, D., Yih, W. T. (2008). The importance of syntactic parsing and inferencing in semantic role labeling. Computational Linguistics, Vol. 34, No. 2, pp. 257-287.
[43] Pust, M., Hermjakob, U., Knight, K., Marcu, D., May, J. (2015). Parsing English into abstract meaning representation using syntax-based machine translation. In: Proc. of EMNLP 2015, Lisbon, pp. 1143-1154.
[44] Razorenov, A. A., Fomichov, V. A. (2014). The Design of a Natural Language Interface for File System Operations on the Basis of a Structured Meanings Model. Procedia Computer Science, Elsevier, Vol. 31, pp. 1005-1011; open access, URL: http://authors.elsevier.com/sd/article/S1877050914005304.
[45] Razorenov, A. A., Fomichov, V. A. (2016). A new formal approach to semantic parsing of instructions and to file manager design. In: Database and Expert Systems Applications, 27th Intern. Conference, DEXA 2016, Porto, Portugal, September 5-8, 2016, Part I, Proceedings. Lecture Notes in Computer Science, Vol. 9827, Cham: Springer International Publishing Switzerland, pp. 416-430.
[46] Rindflesch, T.C., Kilicoglu, H., Fiszman, M., Rosemblat, G., Shin, D. (2011). Semantic MEDLINE: An Advanced Information Management Application for Biomedicine. Information Services and Use, IOS Press, Vol. 31, pp. 15-21.
[47] Sawai, Y., Shindo, H., Matsumoto, Y. (2015). Semantic structure analysis of noun phrases using abstract meaning representation. Proc. of the 53rd Annual Meeting of the ACL (Volume 2: Short Papers), Beijing, pp. 851-856.
[48] Schubert, L.K., Hwang, C.H. (2000). Episodic Logic meets Little Red Riding Hood: A comprehensive, natural representation for language understanding. In: Iwanska, L., Shapiro, S.C. (Eds.), Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language, MIT/AAAI Press, Menlo Park, CA, and Cambridge, MA, pp. 111-174.
[49] Stellato, A. (2016). A language-aware Web will give us a bigger and better Semantic Web. MSW 2015. Multilingual Semantic Web. Proc. of the Fourth Workshop on the Multilingual Semantic Web (MSW4) co-located with the 12th Extended Semantic Web Conference (ESWC 2015), Portoroz, Slovenia, June 1, 2015, pp. 1-14.
[50] Sullivan, D. (2013). FAQ: All about the new Google "Hummingbird" algorithm. Search Engine Land, 26 September 2013, http://searchengineland.com/google-hummingbird-172816 (retrieved 10.11.2016).
[51] Uchida, H., Zhu, M., Della Senta, T. (1999). A Gift for a Millennium.
[52] Uren, V.S., Lei, Y., Motta, E. (2008). SemSearch: Refining Semantic Search. In: Bechhofer, S., Hauswirth, M., Hoffmann, J., Koubarakis, M. (Eds.), ESWC 2008, LNCS, Vol. 5021, Springer, Heidelberg, pp. 874-878.
[53] Vogt, M. (2016). How Natural Language Processing will change the Semantic Web. Semantics, April 13, 2016; https://2016.semantics.cc/how-natural-language-processing-will-change-semantic-web (retrieved 12.09.2016).
[54] Wang, C., Xue, N., Pradhan, S. (2015). Boosting transition-based AMR parsing with refined actions and auxiliary analyzers. In: Proc. of the 53rd Annual Meeting of the ACL (Volume 2: Short Papers), Beijing, pp. 857-862.
[55] Werling, K., Angeli, G., Manning, C.D. (2015). Robust subgraph generation improves abstract meaning representation parsing. In: Proc. of the 53rd Annual Meeting of the ACL (Volume 1: Long Papers), Beijing, pp. 982-991.
[56] Wilks, Y., Brewster, C. (2006). Natural Language Processing as a Foundation of the Semantic Web. Foundations and Trends in Web Science, Vol. 1, No. 3. Hanover, MA; Delft: now Publishers Inc.
[57] Wintner, S. (2009). What science underlies natural language engineering? Computational Linguistics, Vol. 35, No. 4, pp. 641-644.
[58] Yi, S., Loper, E., Palmer, M. (2007). Can semantic roles generalize across genres? Proceedings of the Human Language Technologies Conference of the North American Chapter of the ACL, Rochester, NY, pp. 548-555.

Informatica 41 (2017) 233–252 233

Formal Development of Multi-Agent Systems with FPASSI: Towards Formalizing PASSI Methodology using Rewriting Logic

Mihoub Mazouz
Department of Mathematics and Computer Science, RELA(CS)2 Laboratory
University of Larbi Ben M'Hidi, Oum El Bouaghi, Algeria
E-mail: mazouz_mihoub@hotmail.fr

Farid Mokhati
Department of Mathematics and Computer Science, RELA(CS)2 Laboratory
University of Larbi Ben M'Hidi, Oum El Bouaghi, Algeria
E-mail: mokhati@yahoo.fr

Mourad Badri
Department of Mathematics and Computer Science, Glog Laboratory
University of Quebec, Trois-Rivières, Canada
E-mail: Mourad.Badri@uqtr.ca

Keywords: formal development of MAS, PASSI, validation, verification, rewriting logic, maude, maude-strategy, model-to-text transformation

Received: June 20, 2016

Agent technology has proved its ability and efficiency in modelling complex distributed applications. During the last two decades, several MAS development methodologies have been proposed, for instance Gaia, Tropos and PASSI. Although these methodologies have made significant contributions to meeting several challenges in the MAS development field, most of them do not use formal techniques. Formal methods, as is well known, play a significant role in developing more reliable and robust MAS. This paper presents the Formal-PASSI methodology. Formal-PASSI is an extension of the well-known PASSI methodology. The extension consists mainly of the integration of a new formal model into the design process. The new model is based on the Maude language and its extension Maude-Strategy. It aims at offering a formal description of the MAS under development by a Model-to-Text transformation. The generated formal description is then used to validate some PASSI behavioural diagrams and to check properties at both single- and multi-agent abstraction levels before passing to the code model. The integration of formal methods into the PASSI design process seems to be a good way to ensure the development of high-quality agent-based applications. The proposed approach is supported by a tool (F-PTK) that we have developed; it is illustrated through the ATM case study.

Povzetek: V članku je predstavljena formalna PASSI MAS metodologija, tj. multi-agentna metodologija.

1 Introduction

Current computing systems have become increasingly complex, with high safety requirements. Agent technology has proved its ability and efficiency in modelling complex distributed applications. As with any other technology, the emergence of agent technology has pushed the research community to propose new methodologies, languages and tools to support it and to enable its wider adoption in the industry sector.
Many methodologies, like PASSI [1, 2], Gaia [3, 4], ADELFE [5, 6, 7], Prometheus [8], Tropos [9] and INGENIAS [10], have been proposed to facilitate and assist the development of Multi-Agent Systems (MAS). Although these methodologies have made real progress in the MAS development field, proposing new methodologies that assist agent-based systems development is still insufficient for industrial adoption [11]. The development of such systems requires solid bases in terms of specification. Existing methodologies use abstract and/or semi-formal specifications. Although such types of specifications offer several advantages, such as readability and ease of comprehension, they have drawbacks like ambiguity and inconsistency, which are difficult to detect manually. Formal specifications, however, overcome these drawbacks and enable the description of the system under development in a precise and unambiguous way. Using formal methods is essential to produce high-quality agent-based systems at the end of the development process. In particular, integrating formal methods into the development process of MAS methodologies leads to the production of reliable systems. In order to overcome the problems quoted above, many proposals try to use formal methods in agent-oriented software engineering (AOSE) (see Section 2). However, most of them present several limitations; in particular, they do not use formal methods within an entire design process. Moreover, many of them are not supported by adequate tools.

PASSI (Process for Agent Societies Specification and Implementation) [1, 2] is a step-by-step requirement-to-code methodology for designing and developing agent-oriented systems that integrates concepts from both Object-Oriented Software Engineering (OOSE) and MAS using the UML (Unified Modelling Language) notation. PASSI covers almost all stages of the development process, and can be used to assist the development of general-purpose agent-oriented systems, although it evolved from a long period of experimentation in the development of embedded robotics applications [12]. However, the fact that PASSI is based on a semi-formal language such as UML makes validation and verification activities less effective.

In this paper, we propose F-PASSI (Formal-PASSI), a formalization of the PASSI methodology obtained by adding a new formal model into its design process. The extension is based on rewriting logic [13, 14] and particularly the Maude language [15, 16] (and its extension Maude-Strategy [17]). The integrated model aims at offering a Maude-based formal description of the MAS under development to enrich the semantics of its UML-based design. The produced formal description is then exploited to validate PASSI behavioural diagrams (some of them, so far) by formal simulation thanks to Maude, and by the Maude LTL model checker [18] in order to verify system properties at both single- and multi-agent abstraction levels. A tool was developed to support our approach.

The remainder of this paper is organized as follows. In section 2, we give an overview of major related works. In section 3, we give a brief description of rewriting logic as well as the Maude language (and its extension Maude-Strategy). In section 4, a brief description of the PASSI methodology is given. We introduce, in section 5, the proposed formal extension of PASSI. The tool we developed is presented in section 6. In section 7, the ATM case study is used to illustrate our approach.
Finally, section 8 gives some conclusions and future work directions.

2 Related works

Using formal methods in multi-agent systems development is a challenge raised by many researchers in the MAS area. El Fallah-Seghrouchni et al. have presented a classification of the proposed works on the formal development of MAS [19]. According to the authors, three alternatives can be captured from the literature: (A) formal derivation, which is a kind of model-to-code transformation and aims at realizing a MAS based on a given specification; (B) enhancement of an existing methodology by integrating formal meaning into its design; (C) proposing a new one. Since our work can be considered as an integration of formal methods into an existing methodology, PASSI, this section focuses on works belonging to the second category.

In [20, 21], Ball et al. have presented an incremental development process using Event-B [22] for multi-agent systems. The proposed process can be divided into two stages. In the first one, informal models based on agent concepts are constructed. In the second stage, based on the informal models, the Event-B models are constructed by the developer, who is provided with guidance that makes the transformation from the informal design to the formal models straightforward. The constructed Event-B models are refined and decomposed into specifications of roles. In [23], a set of modelling patterns providing fault-tolerance in Event-B models of multi-agent interactions is presented.

Another work proposing a new formal methodology is ForMAAD [24, 25]. ForMAAD is a model-driven approach for designing agent-based applications. It uses the Agent Modelling Language (AML) [26] to model architectural and behavioural concepts associated with multi-agent systems, and Temporal Z [27] to guarantee a formal verification of the models. Extensions of the StarUML tool (http://staruml.io) were made to support the models they proposed.

Two works using formal methods for the Tropos methodology [9] can be emphasized here. First, Fuxman et al. [28] have proposed an extension of Tropos, Formal Tropos, with a formal specification of early requirements. For that, the Formal Tropos language is defined by integrating the primitive concepts of Tropos with a temporal specification language inspired by KAOS [29]. After the translation (using the implemented T-tool, http://disi.unitn.it/~ft/ft_tool.html) of the requirements specification written by the analyst into an intermediate language, an enhanced version of the NuSMV model checker [30] performs consistency checking ("the specification admits valid scenarios"), possibility checking ("there are some scenarios for the system that respect certain possibility properties") and assertion validation ("all scenarios for the system respect certain assertion properties"). Secondly, in [31], a mapping of β-Tropos concepts [32] into the computational logic-based framework SCIFF [33] is defined, and important formal properties (soundness, completeness and termination) are identified and discussed. The formal specifications are verified using the SCIFF engine. Instead of being written manually, as in the works above, the formal specification in Formal-PASSI is produced in a systematic way thanks to F-PTK (Formal-PASSI Tool Kit), the tool we have developed; this makes it, unlike Formal Tropos, less dependent on the subjective judgment of the developer.
Also, in Formal-PASSI, the formal specification combines, in addition to the domain knowledge, the structure and behaviour of the agents composing the MAS, to be exploited later in order to validate and verify its correctness.

Instead of proposing new formal methodologies for MAS development or enhancing existing ones, other researchers have used formal methods, separately from any methodology, for particular design aspects. Fadil et al. [34] have used the B method [35, 36] to formally model interactions between agents in order to check and then prove the initial UML specification. The approach was illustrated using the Contract-Net protocol as a case study. Jemni Ben Ayed et al. [37] have presented a specification and verification technique for interaction protocols in MAS combining AUML (Agent UML) [38] and the Event-B method [22]. In their technique, the interaction protocol is modelled in an AUML protocol diagram and translated into Event-B. The required interaction protocol (IP) safety and liveness properties are added to the derived specification for verification using the B4free tool (http://www.b4free.com). Like the B method, the Z language [39] and its extension Temporal Z [27] have been the subject of many works. In [40], the authors have presented a formal approach using Temporal Z in two phases. In the specification phase, user requirements are described in an abstract way, avoiding the description of implementation details. Then, based on a succession of refinements, the design phase aims at inventing a set of inter-agent (collective) behaviours as well as intra-agent (individual) behaviours, which have to satisfy the user requirements.

Other works address the use of formal methods at runtime to verify properties that are not verifiable in the design phase, as in [41], where a JADE-based formal verification methodology for MAS following a semi-runtime approach has been proposed. The proposed verification process uses timed trace theory to detect time-constraint failures. Lapouchnian et al. [42] have proposed a combined agent-oriented requirements engineering approach using informal i* [43] models together with ConGolog [44] and (its extension) CASL [45] formal specifications. Social dependencies between agents are modelled using the i* framework, which is used to perform an analysis of opportunities and vulnerabilities. The models are gradually made more precise by using annotated models (annotations are introduced in [46] and extended in [47]). After that, complex processes can be formally modelled using ConGolog or CASL, with subsequent verification or simulation.

In [48], the authors have presented an extension of the G-net formalism [49] (a type of high-level Petri net), called Agent-oriented G-net, to serve as a high-level design of intelligent agents in terms of their internal states, their environment, their interactions, etc. Based on this high-level design, the agent architecture and the detailed design for agent implementation can be derived using the ADK tool they developed. Stamatopoulou et al. [50] have presented an open framework facilitating the formal modelling of multi-agent systems, called OPERAS, which employs two existing formal methods: X-machines [51] and PPS (Population P Systems) [52]. Using this framework, an agent's behaviour can be formally modelled and controlled in terms of its internal states, as well as the mutations that occur in the structure of a MAS. The authors have applied the framework to swarm systems.
Compared to the works discussed above, the approach we propose: (1) integrates formal methods, not separately from any methodology, but into an entire design process (the PASSI design process); (2) is based on a powerful formal language, Maude, which offers many tools, such as the Maude LTL model checker [18]; (3) checks the specified properties before passing to code details; (4) is supported by a tool (F-PTK) which offers many services, such as automating the production of the Maude-based formal description of the MAS under development by means of its structure (agents, roles, tasks, action tasks) and the domain knowledge.

3 Rewriting Logic, Maude & Maude-Strategy

3.1 Rewriting Logic

Rewriting logic was introduced by José Meseguer [13, 14] to describe concurrent systems. It makes it possible to reason correctly about concurrent systems that have states and evolve in terms of transitions. Indeed, rewriting logic unifies several formal models which express concurrency, such as labelled transition systems [53], Petri nets [54] and CCS [55]. The basic statements of this logic are called rewriting rules and have the form t → t' if C, where t and t' are algebraic terms describing a partial state of the concurrent system. A rewriting rule, in this case, describes a change of one partial state into another if a certain condition C is true. Formally, a rewrite theory is a triplet R = (Σ, E, R) where:
- (Σ, E) is an equational theory with function symbols Σ and equations E;
- R is a set of labelled rewrite rules. These rules are of the form t → t' (unconditional rewriting rules) or t → t' if condition (conditional rewriting rules). An unconditional rewriting rule states that the term t becomes t'; a conditional rewriting rule states that t becomes t' if a certain condition is true.

A rewrite theory has a set of inference rules [13, 14]:
- Reflexivity: for each [t] ∈ TΣ,E(X), [t] → [t].
- Congruence: for each f ∈ Σn, n ∈ N: if [t1] → [t1'], ..., [tn] → [tn'], then [f(t1, ..., tn)] → [f(t1', ..., tn')].
- Replacement: for each rewriting rule r: [t(x1, ..., xn)] → [t'(x1, ..., xn)] in R: if [w1] → [w1'], ..., [wn] → [wn'], then [t(w/x)] → [t'(w'/x)], where t(w/x) denotes the simultaneous substitution of wi for xi in t.
- Transitivity: if [t1] → [t2] and [t2] → [t3], then [t1] → [t3].

Figure 1 visualizes each one of these rules.
Figure 1: Visualization of the inference rules of a rewriting theory [14].

Among the languages implementing rewriting logic, we can cite CafeOBJ [56] and Maude [15, 16].

3.2 Maude language

Defined by J. Meseguer, the Maude language [15, 16] is one of the most powerful implementations of rewriting logic. Maude is a high-level, very powerful declarative language for constructing various kinds of applications based on both equational and rewriting logics. It offers few syntactic constructs and a well-defined semantics. The basic unit of specification and programming in Maude is the module. In fact, there are three types of modules. Functional modules define sorts of data and the operations on these data through equational theories. The sorts of data are composed of elements that are denoted by terms. A functional module is declared according to the following syntax:

fmod MODULE-NAME is … endfm

A minimal illustrative example of a functional module is given below.
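To make the functional-module syntax concrete, here is a minimal sketch of such a module (the module name NAT-PAIR, the sort Pair and its operations are illustrative assumptions of ours, not material from the paper):

fmod NAT-PAIR is
  protecting NAT .                      *** reuse Maude's predefined natural numbers
  sort Pair .                           *** a new sort of data
  op <_,_> : Nat Nat -> Pair [ctor] .   *** constructor building a pair of naturals
  op swap : Pair -> Pair .              *** an operation on pairs
  vars N M : Nat .
  eq swap(< N, M >) = < M, N > .        *** its meaning, defined equationally
endfm

Under these assumptions, reducing the term swap(< 1, 2 >) in this module (e.g., with Maude's reduce command) yields the normal form < 2, 1 >, computed purely by equational simplification.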
System modules specify a rewriting theory. A system module has sorts and operations, and can have equations and rewriting rules, which can be conditional. A system module is declared as follows:

mod MODULE-NAME is … endm

What a system module adds, compared to a functional module, is the ability to specify rewriting rules. The unconditional rules are declared as follows: rl [