Volume 41 Number 2 June 2017

Special Issue: Information and Communication Technology

Guest Editors: Luc De Raedt, Yves Deville, Marc Bui, Dieu-Linh Truong

Editorial Boards

Informatica is a journal primarily covering intelligent systems in the European computer science, informatics and cognitive community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communications between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, critical examinations of existing publications and news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors.

The coordination necessary is made through the Executive Editors who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Higher Education, Science and Technology. Each author is guaranteed to receive the reviews of his article. When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article.

Executive Editor – Editor in Chief
Matjaž Gams
Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 251 93 85
matjaz.gams@ijs.si
http://dis.ijs.si/mezi/matjaz.html

Editor Emeritus
Anton P. Železnikar
Volaričeva 8, Ljubljana, Slovenia
s51em@lea.hamradio.si
http://lea.hamradio.si/~s51em/

Executive Associate Editor – Deputy Managing Editor
Mitja Luštrek, Jožef Stefan Institute
mitja.lustrek@ijs.si

Executive Associate Editor – Technical Editor
Drago Torkar, Jožef Stefan Institute
Jamova 39, 1000 Ljubljana, Slovenia
Phone: +386 1 4773 900, Fax: +386 1 251 93 85
drago.torkar@ijs.si

Contact Associate Editors
Europe, Africa: Matjaž Gams
N. and S. America: Shahram Rahimi
Asia, Australia: Ling Feng
Overview papers: Maria Ganzha, Wiesław Pawłowski, Aleksander Denisiuk

Editorial Board
Juan Carlos Augusto (Argentina)
Vladimir Batagelj (Slovenia)
Francesco Bergadano (Italy)
Marco Botta (Italy)
Pavel Brazdil (Portugal)
Andrej Brodnik (Slovenia)
Ivan Bruha (Canada)
Wray Buntine (Finland)
Zhihua Cui (China)
Aleksander Denisiuk (Poland)
Hubert L. Dreyfus (USA)
Jozo Dujmović (USA)
Johann Eder (Austria)
George Eleftherakis (Greece)
Ling Feng (China)
Vladimir A. Fomichov (Russia)
Maria Ganzha (Poland)
Sumit Goyal (India)
Marjan Gušev (Macedonia)
N. Jaisankar (India)
Dariusz Jacek Jakóbczak (Poland)
Dimitris Kanellopoulos (Greece)
Samee Ullah Khan (USA)
Hiroaki Kitano (Japan)
Igor Kononenko (Slovenia)
Miroslav Kubat (USA)
Ante Lauc (Croatia)
Jadran Lenarčič (Slovenia)
Shiguo Lian (China)
Suzana Loskovska (Macedonia)
Ramon L. de Mantaras (Spain)
Natividad Martínez Madrid (Germany)
Sando Martinčić-Ipišić (Croatia)
Angelo Montanari (Italy)
Pavol Návrat (Slovakia)
Jerzy R. Nawrocki (Poland)
Nadia Nedjah (Brazil)
Franc Novak (Slovenia)
Marcin Paprzycki (USA/Poland)
Wiesław Pawłowski (Poland)
Ivana Podnar Žarko (Croatia)
Karl H. Pribram (USA)
Luc De Raedt (Belgium)
Shahram Rahimi (USA)
Dejan Raković (Serbia)
Jean Ramaekers (Belgium)
Wilhelm Rossak (Germany)
Ivan Rozman (Slovenia)
Sugata Sanyal (India)
Walter Schempp (Germany)
Johannes Schwinn (Germany)
Zhongzhi Shi (China)
Oliviero Stock (Italy)
Robert Trappl (Austria)
Terry Winograd (USA)
Stefan Wrobel (Germany)
Konrad Wrona (France)
Xindong Wu (USA)
Yudong Zhang (China)
Rushan Ziatdinov (Russia & Turkey)

Editors' Introduction to the Special Issue on "Information and Communication Technology"

Since 2010, the Symposium on Information and Communication Technology (SoICT) has been organised annually. The symposium provides an academic forum for researchers to share their latest research findings and to identify future challenges in computer science. The best papers from SoICT 2015 were extended and published in the special issue "SoICT 2015" of Informatica, Vol. 40, No. 2 (2016).

In 2016, SoICT was held in Ho Chi Minh City, Vietnam, on December 8–9. The symposium covered four major areas of research: Artificial Intelligence and Big Data, Information Networks and Communication Systems, Human-Computer Interaction, and Software Engineering and Applied Computing. Of 130 submissions from 20 countries, 58 papers were accepted for presentation at SoICT 2016. Among them, 6 papers were carefully selected, after further extension and additional reviews, for inclusion in this special issue.

The paper "Improvement of Person Tracking Accuracy in Camera Network by Fusing WiFi and Visual Information" by Thi Thanh Thuy Pham, Thi-Lan Le and Trung-Kien Dao addresses the problem of person tracking in camera networks. The authors assign trajectories using the person identity (ID) determined at each video frame. In order to improve the accuracy of vision-based person tracking, the authors propose a scheme that fuses WiFi and visual signals. The fusion method allows tracking by identification in non-overlapping cameras, with clear identity information taken from the WiFi adapter.

The paper "Persons-In-Places: A Deep Features Based Approach For Searching A Specific Person In A Specific Location" by Vinh-Tiep Nguyen, Thanh Duc Ngo, Minh-Triet Tran, Duy-Dinh Le and Duc Anh Duong considers the problem of video retrieval with complex queries that simultaneously cover person and location information. The authors introduce a framework that leverages the Bag-of-Visual-Words (BOW) model and deep features for person-place video retrieval.

The research in the paper "Another Look at Radial Visualization for Class-preserving Multivariate Data Visualization" was conducted by Van Long Tran. Radial visualization is one of the common information visualization concepts for visualizing multivariate data. However, radial visualization may display different information about the structure of multivariate data.
For example, all points that are scalar multiples of a given point may map to the same point in the visual space. An optimal layout of radial visualization is usually found by defining a suitable order of the data dimensions on the unit circle. In this paper, the author proposes a novel method that improves the radial visualization layout for cluster preservation of multivariate data.

The paper "Key-Value-Links: A New Data Model for Developing Efficient RDMA-Based In-Memory Stores" by Hai Duc Nguyen, The De Vu, Duc Hieu Nguyen, Minh Duc Le, Tien Hai Ho and Tran Vu Pham proposes a new data model, named Key-Value-Links (KVL), to improve in-memory stores that use RDMA. The KVL data model is essentially a key-value model with several extensions; the store implementing it is named KELI. The results of experiments on real-life workloads indicate that KELI, even without much optimization, easily outperforms Memcached, a popular in-memory key-value store, in many cases.

The paper "Defense Strategies against Byzantine Attacks in a Consensus-Based Network Intrusion Detection System" by Michel Toulouse, Hai Le, Cao Vien Phung and Denis Hock addresses a security problem. Although the purpose of a Network Intrusion Detection System (NIDS) is to monitor network traffic so as to detect malicious usage of network facilities, the NIDS can itself be attacked. The paper investigates such vulnerabilities in a recent consensus-based NIDS proposal. It is known that consensus algorithms are not resilient to compromised nodes sharing falsified information, i.e. they can be the targets of Byzantine attacks. The paper proposes two different strategies aiming at identifying compromised NIDS modules that share falsified information. Also, a simple approach is proposed to isolate compromised modules, returning the NIDS to a non-compromised state. Validations of the defense strategies are provided through several simulations of Distributed Denial of Service attacks using the NSL-KDD data set.

The paper "Emotional contagion model for group evacuation simulation" by Xuan Hien Ta, Benoit Gaudou, Dominique Longin and Tuong Vinh Ho focuses on fear-related emotions and their positive impact on the survival capabilities of human beings in crisis situations. The authors propose a new model of emotional contagion based on some main findings in social psychology. This model was formalized mathematically, implemented and tested in the GAMA agent-based simulation platform in the context of evacuation simulation. The authors experimentally assessed the impact of three factors (emotion decay, environment, neighbours' emotional contagion) on emotion dynamics at the individual and group levels.

Luc De Raedt
Marc Bui
Yves Deville
Dieu-Linh Truong

Informatica 41 (2017) 133–148

Improvement of Person Tracking Accuracy in Camera Network by Fusing WiFi and Visual Information

Thi Thanh Thuy Pham
Academy of People Security, Hanoi, Vietnam
E-mail: thanh-thuy.pham@mica.edu.vn

Thi-Lan Le and Trung-Kien Dao
MICA International Research Institute, Hanoi University of Science and Technology (HUST - CNRS/UMI-2954 - Grenoble INP), Hanoi, Vietnam
E-mail: {thi-lan.le, trung-kien.dao}@mica.edu.vn

Keywords: camera, WiFi, fusion method, person tracking by identification

Received: March 29, 2017

Person tracking in camera networks is still an open problem. The main challenge is how to correctly link individual trajectories when people move within one camera FOV (field of view) or switch to another one.
This requires solving the problem of person re-identification (Re-ID) during tracking. A popular method is to assign the current position to the previous one based on the minimum distance between them; this is called person identification by tracking. In this work, we approach tracking by identification, which means the trajectory assignment is done by the person identity (ID) determined at each video frame. In order to improve the accuracy of vision-based person tracking, we focus on accuracy enhancement for person identification by adding the ID of the WiFi-enabled device held by each person. A fusion scheme of WiFi and visual signals is proposed in this work for person tracking. An optimal assignment and a Kalman filter are used in this combination to match the position observations and predicted states from the camera and WiFi systems. The correction step of the Kalman filter is then applied for each tracker to output state estimations of locations. The fusion method allows tracking by identification in non-overlapping cameras, with clear identity information taken from the WiFi adapter. The evaluation on a multi-modal dataset shows that the proposed fusion method outperforms the vision-only method.

Povzetek: A method for person tracking across cameras by means of data fusion is described.

1 Introduction

There have been several attempts to combine camera and WiFi systems for indoor person tracking. A multi-modal system is reported in [1] using WiFi-based localization and tracking by stationary cameras. The combined system focuses on improving the positioning accuracy and confidence at room level. According to the authors' assessments, camera-based localization achieves higher positioning accuracy than a WiFi-based system. However, blind points, occlusions and person identification are much more challenging for camera systems. WiFi systems give clearer identity information because each mobile device has a unique MAC address, but the considered targets are required to hold mobile devices during tracking. In that work, the RSSI property and a fingerprinting method are used in the WiFi system to locate mobile targets. In the camera-based system, foreground segmentation is done by the GMM (Gaussian Mixture Model) method. The region which contains the person's feet is then extracted from the foreground and projected on the floor plane. Gaussian kernels are used to model the foot region. Each single module is executed depending on the availability of each sensor's information. When both are available, a combined Bayes model with the corresponding confidence weights is applied.

The authors in [2] reported another approach for object localization fusing images and WiFi signals. The system can be deployed in both indoor and outdoor environments. The algorithm of PlaceEngine [3] and a modified version of the Centroid algorithm [4] are used in that work for WiFi-based localization. The mixture of observation models based on a Particle filter allows targets to be tracked continuously even when they are occluded by other objects or temporarily disappear while moving in blind areas among disjoint cameras.

In [5], the authors proposed to combine RGB data with wireless signals emitted from a person's cell phone to locate and track individuals. The authors considered the unique MAC address of a mobile device as a reliable cue of a person's ID.
Wireless data is efficiently embedded in RGB data as a ring image, which captures the radius estimation, error bounds, and confidence level (noise detection) for each antenna. In order to improve the tracking algorithm, each MAC address is assigned to an observed tracklet, and a bipartite graph is proposed for the data association problem. The testing results proved that the performance of person localization and tracking can be improved by fusing RGB and wireless data.

In this paper, we propose a fusion method of WiFi and camera for person localization and Re-ID in a camera network. It improves vision-based person tracking not only within one camera FOV, but also among different camera FOVs, by using the unique ID information from the WiFi hardware.

The rest of the paper is organized as follows. In Section 2, a framework for multi-modal person tracking by fusion of WiFi and camera is presented. Sections 3 and 4 describe the single person localization systems based on visual and WiFi signals, respectively. The combined method of WiFi and camera is discussed in Section 5. The comparative evaluations are shown in Section 6. Conclusions and future directions are given in the last section.

2 Framework

Figure 1 shows the fusion framework for person localization and Re-ID in non-overlapping camera networks. The combined model is processed in the real scenario of a fully-automated person surveillance system, which is reported in our previous work [6]. In this system, the camera FOVs are covered by the WiFi range. This means WiFi signals are always available for person localization, but the disjoint camera shot areas cause intermittent positioning for the vision-based system. In each camera FOV, person localization is done in three phases, i.e., human detection, tracking and localization, to output the person ID determined by the camera (ID_j^C) and the corresponding position (P_j^C). Because the WiFi range covers the camera FOVs, in each camera FOV the vision-based positioning result of person j is combined with the WiFi-based localization result of person i (P_i^W, ID_i^W) by a fusion algorithm, in order to make effective decisions about the position and identity of each person in the environment. When people switch from one camera FOV to another, they are re-identified to update the ID of each individual trajectory. The trajectories through the cameras are also linked to show the entire route in the environment. Additionally, in the fusion model, WiFi-based localization results are used to activate the cameras which are in the positioning range returned by the WiFi-based system. The proposed mixture model allows continuous localization and identification of a person moving in non-overlapping camera networks.

In the proposed system, the positioning processes are executed independently in each single model. The locations calculated from both models of WiFi and camera are shown on the uniform coordinate system of a 2D floor map. A fusion algorithm for person localization and Re-ID is proposed. It is based on a Kalman filter model, together with an optimal assignment of the estimated and observed locations from both models. The details of each single person localization system and the proposed fusion algorithm are given in the next sections.
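To make the data flow of this framework concrete, the following minimal Python sketch mirrors one fusion step. It is only an illustration of the architecture: the names `Observation`, `fuse` and `track_step` are hypothetical, and the greedy nearest-neighbour ID transfer here stands in for the optimal-assignment and Kalman machinery detailed in Section 5.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional

@dataclass
class Observation:
    x: float                     # position on the 2D floor map (metres)
    y: float
    ident: Optional[str] = None  # MAC-derived ID (WiFi observations only)

def fuse(wifi_obs: List[Observation],
         cam_obs: List[Observation]) -> List[Observation]:
    """Greedy stand-in for the optimal assignment of Section 5:
    each camera position inherits the ID of the nearest WiFi position.
    Assumes at least one WiFi observation (everyone checks in a device)."""
    fused = []
    for c in cam_obs:
        nearest = min(wifi_obs, key=lambda w: hypot(w.x - c.x, w.y - c.y))
        fused.append(Observation(c.x, c.y, nearest.ident))
    return fused

def track_step(wifi_obs, cam_obs):
    # Cameras give the accurate position; WiFi always covers the scene,
    # so fall back to WiFi-only positions outside every camera FOV.
    return fuse(wifi_obs, cam_obs) if cam_obs else wifi_obs
```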
3 Vision-based person localization and Re-ID

Camera-based person localization and Re-ID is the process of finding the positions and the corresponding ID of a person when he/she moves in one camera FOV or switches from one camera FOV to others in a camera network. It refers to linking person trajectories in the frame sequences captured from multiple cameras. These trajectories are then transformed to the real-world coordinate system by a process called 3D localization.

3.1 Person localization

A camera-based person localization system includes three main steps: human detection, tracking and 3D localization. For each camera FOV, human detection is executed at each frame to output the human ROI (Region of Interest), which is represented by a rectangular bounding box containing the person. The person's position on the image is defined in this work as the middle point of the rectangle's bottom edge, which is in contact with the floor plane (see Figure 2). It is called a FootPoint position. Human tracking in a frame sequence captured from a camera FOV is considered as FootPoint tracking. In case of multi-person tracking, each detected FootPoint has to be assigned the corresponding ID. 3D person localization is done by transforming FootPoint positions to real-world locations on a predefined 2D coordinate system of the floor plane where the person moves.

Figure 1: Framework of person localization and Re-ID using the combined system of WiFi and camera.

Figure 2: Examples of tracking lines which are formed by linking trajectories of corresponding FootPoint positions.

First, a combination of HOG-SVM and GMM background subtraction techniques [6] is applied for human detection. In order to improve the performance of human detection, the shadow removal method in [6] is used as a post-processing step.

Second, in each camera FOV, based on the detection results, FootPoint tracking is done by utilizing a Kalman filter and the Hungarian data association algorithm [7] to improve the performance of track association. For each camera, a grid of the floor plane where people move in the camera FOV, namely the detection grid (see Figure 3), is defined as a function G(x, y):

G(x, y) = \begin{cases} 1 & \text{if } (x, y) \in C_T, \\ 0 & \text{otherwise,} \end{cases}

where C_T is a threshold region bounded by a contour line which is the border of the camera FOV on the floor plane. As each detected person is represented by a FootPoint position, a FootPoint position can only belong to one of the positions of the detection grid where G(x, y) = 1.

Figure 3: Example of a grid map and threshold region bounded by a contour line.

Let (px_t, py_t) denote the pixel coordinates of a FootPoint position at time t in the grid, (mx_t, my_t) the pixel coordinates of a measurement in the grid, so that G(mx_t, my_t) = 1, and (vx_t, vy_t) the velocity values at time t in the x and y directions. The state vector x_t of a user at time frame t, characterized by the corresponding FootPoint location, and the measurement vector z_t are defined as:

x_t = (px_t, py_t, vx_t, vy_t)   (1)

z_t = (mx_t, my_t)   (2)

Using the state and measurement update equations of the Kalman filter, in conjunction with the initial conditions, the state vector and its covariance matrix are estimated at each time frame. The 2D spatial coordinates of an estimated state (p̂x, p̂y) (an estimated FootPoint position) give the position p of the user u.
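To illustrate the detection grid G(x, y) defined above, the sketch below rasterises a camera-FOV border contour into a boolean lookup grid with OpenCV. This is a minimal sketch under stated assumptions: the contour coordinates and grid size are made-up values, not taken from the paper.

```python
import numpy as np
import cv2

def build_detection_grid(shape, fov_contour):
    """Rasterise the camera-FOV border contour C_T on the floor plane
    into a boolean grid so that G(x, y) = 1 inside the threshold region."""
    grid = np.zeros(shape, dtype=np.uint8)
    cv2.fillPoly(grid, [fov_contour.astype(np.int32)], 1)
    return grid.astype(bool)

# Illustrative contour of one camera's FOV footprint (values made up).
contour = np.array([[50, 40], [400, 60], [420, 300], [60, 280]])
G = build_detection_grid((480, 640), contour)   # rows = y, cols = x

def is_valid_footpoint(G, x, y):
    # G(x, y) = 1 -> a plausible FootPoint cell inside the FOV border
    return bool(G[y, x])
```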
In multi-person tracking, a separate Kalman filter is initialized for, and models, each person's trajectory. A set U_t of individuals and a set M_t of measurements at time frame t are defined as:

U_t = \{u_1, u_2, .., u_N\}   (3)

M_t = \{m_1, m_2, .., m_L\}   (4)

where N is the number of people to be tracked (trackers), and L is the number of available measurements at time t. In order to assign a person i to a measurement j, the Hungarian method is used.

Third, in order to locate people in the real-world coordinate system, we define a 2D map of the floor plane on which people move. This map contains all considered camera FOVs on the floor plane. We then calculate the coordinates of each FootPoint position on the 2D map on the basis of camera calibration and a homography transform [8]. The trajectories of each person through the cameras are then linked by a method of warping multiple camera FOVs using a stereo calibration technique [9].

3.2 Person re-ID

In this paper, the person Re-ID problem is solved in the scenario of tracking by identification. This means that at each detected FootPoint position, we extract the human ROI, and a feature descriptor is built on this region. In this work, a robust KDES descriptor (Kernel Descriptor), which was proposed in our previous work [6], and an SVM classifier are used for person Re-ID in camera networks. The basic idea of the KDES descriptor is to compute an approximate explicit feature map for a kernel match function (see Figure 4). In other words, the kernel match functions are approximated by explicit feature maps. This enables efficient learning methods for linear kernels to be applied to non-linear kernels. Given a match kernel function k(x, y), the feature map ϕ(·) for the kernel k(x, y) is a function mapping a vector x into a feature space so that k(x, y) = ϕ(x)^⊤ϕ(y). Given a set of basis vectors B = \{ϕ(v_i)\}_{i=1}^{D}, the approximation of the feature map ϕ(x) can be:

φ(x) = G k_B(x)   (5)

where G^⊤G = K_{BB}^{-1}, K_{BB} is a D × D matrix with \{K_{BB}\}_{ij} = k(v_i, v_j), and k_B is a D × 1 vector with \{k_B\}_i = k(x, v_i).

Figure 4: The basic idea of representation based on kernel methods: the kernel match function k(x, y) ≈ φ(x)^⊤φ(y) is approximated by an explicit feature map x → φ(x).

Similar to [10], three match kernel functions for gradient, color and shape are built from different pixel attributes of gradient, color and local binary pattern (LBP). For each match kernel, feature extraction is done at three levels: pixel, patch and the whole detected human region.
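A small numerical sketch of the feature-map approximation in Eq. (5) follows. It assumes a Gaussian match kernel and random toy basis vectors (the actual KDES kernels operate on gradient, color and LBP pixel attributes); G is taken as K_BB^{-1/2}, which satisfies G^⊤G = K_BB^{-1}.

```python
import numpy as np

def gaussian_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def feature_map_factory(basis, kernel=gaussian_kernel):
    """Approximate feature map of Eq. (5): phi(x) = G k_B(x) with
    G^T G = K_BB^{-1}, so phi(x)^T phi(y) ~= k_B(x)^T K_BB^{-1} k_B(y)."""
    D = len(basis)
    K_BB = np.array([[kernel(vi, vj) for vj in basis] for vi in basis])
    # G = K_BB^{-1/2} via eigendecomposition (small ridge for stability)
    w, V = np.linalg.eigh(K_BB + 1e-8 * np.eye(D))
    G = (V / np.sqrt(w)) @ V.T
    def phi(x):
        k_B = np.array([kernel(x, v) for v in basis])   # D x 1 vector
        return G @ k_B
    return phi

rng = np.random.default_rng(0)
basis = rng.normal(size=(8, 5))          # toy basis vectors {v_i}
phi = feature_map_factory(basis)
x, y = rng.normal(size=5), rng.normal(size=5)
# The two values agree approximately when the basis covers the inputs well.
print(phi(x) @ phi(y), gaussian_kernel(x, y))
```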
4 WiFi-based person localization

For WiFi, RSSI is the most popular attribute used in localization. However, the localization performance depends much on how well we can model the relationship between RSSI and distance. Two main approaches have been proposed to solve this: path-loss/radio propagation models [12, 13] and fingerprinting methods [14]. The first is still an open subject, because it is not easy to obtain an optimal model of the relationship between RSSI and distance. The second is time- and workforce-consuming, but it is effective for localization, especially when probabilistic methods are applied.

In this work, both a radio propagation model and a fingerprinting method for WiFi-based localization are employed. A probabilistic propagation model (PPM) from [11], together with a newly defined radio map in the fingerprinting database, are used. The radio propagation model reflects the complex nature of indoor environments by taking into account obstacles, such as walls and floors, to model the relationship between the RSSI value and the distance to a reference point (RP). The model is based on the empirical equation of radio-frequency signal strength in indoor environments, and its uncertainty is accounted for by probabilistic characteristics. An optimization process based on a genetic algorithm is also applied to tune the system parameters for the best fit with the devices in use. Based on the probabilistic propagation model, the distance between a mobile user and the APs is calculated. In the fingerprinting database, a new radio map of distance features, instead of RSSI values, is defined in order to make the radio map more reliable and stable, with lower cost for setting up and updating. Additionally, the KNN matching method is applied with an additional coefficient reflecting temporal changes of the fingerprinting data in the environment. The flowchart of the proposed WiFi-based person localization system is illustrated in Figure 5, with two main phases of training and testing.

Figure 5: Diagram of the proposed WiFi-based object localization system (offline training phase: RP coordinates and PPM-derived distance values stored in the fingerprint database on the server; online testing phase: RSSI values from the mobile user converted by the PPM to distance values and matched by KNN to output the position).

The first phase is processed offline: radio maps are constructed to build the fingerprint database. Normally, a radio map contains RP coordinates and the corresponding RSSI values from the available APs. However, in our proposed system, the RSSI values are replaced by distance values. A distance value is defined as the distance d_i(L) from the i-th RP to the L-th AP in range (see Figure 6), calculated from RSSI observations by using the PPM model.

Figure 6: An example of a radio map with a set of RPs p_i and the distance values d_i(L) from each RP to L APs.

In the testing phase, a mobile device continuously scans signals from nearby APs and sends the corresponding RSSI values to a server. These values are then transformed to distance values by the proposed probabilistic propagation model. Distance matching against the fingerprint database is done by the KNN method to find the best candidates for the mobile user's location.

4.1 Probabilistic propagation model

The probabilistic propagation model is formed by a deterministic model, given in Eq. (6), and a probabilistic model:

P = P_0 - 10 n \log\left(\frac{r}{r_0}\right) - k_d \sum_{i=1}^{n_w} \frac{d_i}{\cos\beta_i}   (6)

where n_w is the number of walls and floors between the AP and the receiver, d_i is the thickness of the i-th wall/floor, β_i is the angle of arrival corresponding to the i-th wall/floor, and k_d is an attenuation factor per wall/floor thickness unit, as illustrated in Figure 7.

Figure 7: WiFi signal attenuation through walls/floors.
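The deterministic model of Eq. (6) can be evaluated and inverted in closed form once the wall geometry is known. The sketch below assumes a base-10 logarithm and reuses the first-scenario parameter values later reported in Table 2 (Section 6.3.1); the wall list is illustrative only.

```python
import math

def predicted_rssi(r, walls, P0=-41.0, n=1.1, r0=5.0, kd=49.23):
    """Deterministic model of Eq. (6). `walls` is a list of
    (thickness_m, angle_of_arrival_rad) pairs."""
    attenuation = kd * sum(d / math.cos(beta) for d, beta in walls)
    return P0 - 10.0 * n * math.log10(r / r0) - attenuation

def distance_from_rssi(P, walls, P0=-41.0, n=1.1, r0=5.0, kd=49.23):
    """Closed-form inversion of Eq. (6) for the nominal distance r-bar."""
    attenuation = kd * sum(d / math.cos(beta) for d, beta in walls)
    return r0 * 10.0 ** ((P0 - attenuation - P) / (10.0 * n))

# One wall of 0.2 m crossed at 30 degrees (illustrative numbers only):
walls = [(0.2, math.radians(30.0))]
P = predicted_rssi(8.0, walls)
print(P, distance_from_rssi(P, walls))   # recovers r = 8.0
```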
The deterministic model in Eq. (6) does not consider the uncertainty of the RSSI values at a given distance, so a probabilistic model (Eq. (7)) is proposed. In reality, given an RSSI P, the distance r might not be exactly the value calculated from Equation (6), but it lies within a range around this value, whose center is denoted by r̄. More precisely, r̄ is the nominal value of the distance r with the highest probability. Given an RSSI P, the distribution of the distance is assumed to follow a normal (Gaussian) distribution with median r̄:

ρ(r, P) = \Pr(r \mid P) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(r - \bar{r})^2}{2\sigma^2}}   (7)

where σ is the standard deviation, which is also a function of P. For simplicity, σ is assumed to be related to r̄ by a linear relation:

σ = k_σ \bar{r}   (8)

In the proposed probabilistic propagation model, there are in total five parameters to be determined: P_0, r_0, n, k_d and k_σ. Except for k_σ, the parameters can be estimated separately from individual measurements in a straightforward manner. However, the values of these parameters can be slightly affected by the assumptions taken in the RF (radio frequency) propagation model. For this reason, a genetic algorithm (GA) [15] is used to find the optimal parameter set all together. Genetic algorithms are global search techniques modeled after the natural genetic mechanism to find approximate or exact solutions for optimization and search problems. In a GA, each parameter to be optimized is represented by a gene. Moreover, each individual is characterized by a chromosome, which is here the above set of parameters awaiting optimization. To assess the quality of an individual, a fitness function (objective function, or cost function) must be defined. For the localization module, the fitness function Ψ is defined as the root mean square of the localization error:

Ψ = \left(\frac{1}{N}\sum_{i=1}^{N}\left[(\hat{x}_i - x_i)^2 + (\hat{y}_i - y_i)^2 + (\hat{z}_i - z_i)^2\right]\right)^{1/2}   (9)

where N is the number of measurements, and (x_i, y_i, z_i) and (x̂_i, ŷ_i, ẑ_i) are the real and the estimated positions, respectively.
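The fitness function of Eq. (9) is straightforward to implement. A minimal sketch follows; the full GA loop from [15] is omitted, and the test data here are synthetic.

```python
import numpy as np

def fitness(est, real):
    """Fitness Psi of Eq. (9): root mean square localization error over
    N measurements, with rows (x, y, z) of estimated and real positions."""
    est, real = np.asarray(est), np.asarray(real)
    return np.sqrt(np.mean(np.sum((est - real) ** 2, axis=1)))

rng = np.random.default_rng(1)
real = rng.uniform(0, 10, size=(100, 3))
est = real + rng.normal(scale=0.5, size=real.shape)
print(fitness(est, real))   # about 0.5 * sqrt(3) for 0.5 m noise per axis
```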
4.2 Fingerprinting database and KNN matching

Normally, a radio map in the fingerprinting method is defined as follows:

R ≜ \{(p_i, F(p_i)) \mid i = 1, .., N\}   (10)

where p_i ≜ [p_x\ p_y\ p_z]^⊤ are the real-world coordinates of the i-th RP and F(p_i) ≜ [r_i(1), .., r_i(n)] is the fingerprinting matrix, with n being the number of training samples at each RP. The vector r_i(t) ≜ [r_i^1(t), .., r_i^L(t)]^⊤ contains the RSSI values scanned from L APs at time t and location p_i. By using the distance feature instead of RSSI, the radio map in Equation (10) then has the fingerprinting matrix F(p_i) ≜ [d_i(1), .., d_i(n)], where the vector d_i(t) ≜ [d_i^1(t), .., d_i^L(t)]^⊤ contains the distance samples from the i-th RP to the L APs. This results in a reliable and stable radio map even in case some APs are inactive at a certain point in time. Furthermore, the cost of setting up and updating the radio map is much lower than when using RSSI as usual: it is only rebuilt when we deploy new APs and RPs or discard them from the WiFi-based localization system.

In the testing phase, the RSSI values scanned from nearby APs by a mobile device are converted to the corresponding distance values by the PPM model. They are then compared with the training data to find the best matches. The matching method used in this work is KNN: the prediction for a new instance is based on its nearest neighbors in the training data. There are three main ingredients associated with this method: (1) the similarity measure (the distance measurement) between the query patterns and the training data; (2) the number of neighbors to be taken in the prediction; (3) the weights of the neighbors. Euclidean and Manhattan distances are two common geometric measures, of which Euclidean is the most used in WiFi-based localization systems [16, 17]. In this work, the KNN method is evaluated with the Euclidean measure.

In the proposed radio map, each RP is represented by a vector d_i(t) ≜ [d_i^1(t), .., d_i^L(t)]^⊤ in an L-dimensional space. In the learning phase, all these training data D with their dependent variables are stored. In this case, the dependent variables are the positions p_i of the RPs in the environment. In prediction, for a new query pattern z and for each instance d in D, the similarity between d and z is computed by the Euclidean distance measure:

l(d, z) = \sqrt{\sum_{i=1}^{L} (d_i - z_i)^2}   (11)

A set NB(z) of the nearest neighbors of z with |NB(z)| = k is then determined, and the estimated location for z is calculated. To find an optimal k, we test on the empirical data with k in the range from 1 to 200, using the error function (12) for each k:

E_k = \sqrt{\sum_{i=1}^{n} \left(\frac{\hat{y}_i - y_i}{y_i}\right)^2}   (12)

where ŷ is the estimated position and y the true position. Finally, the predicted location of z is calculated as the weighted sum of the k neighbors (13):

y_z = \frac{\sum_{d \in NB(z)} w(d, z) \times y_d}{\sum_{d \in NB(z)} w(d, z)}   (13)

where the weights w are chosen by (14):

w(d, z) = e^{-\theta \times l(d, z)} \times e^{-\lambda \times |t_i - t_0|}   (14)

where θ and λ are constants used to define the curve of the exponential functions; t_0 is the time a query instance is captured, and t_i is the time of WiFi signal scanning at each corresponding RP in the training phase; l(d, z) is the dissimilarity between a query instance and its neighbor. In Equation (14), besides the weight based on dissimilarity (θ), a new coefficient λ is proposed to reflect the chronological changes of the fingerprinting data in the environment. This means that fingerprinting data updated recently with respect to the query instance will have a higher weight than older data.
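Equations (11)–(14) combine into a compact weighted-KNN estimator. A minimal NumPy sketch follows, using the k, θ and λ values chosen later in Section 6.3.1; the array names are illustrative assumptions.

```python
import numpy as np

def knn_locate(z, D, P, times, t0, k=9, theta=1.1, lam=2e-6):
    """Weighted KNN of Eqs. (11)-(14). D holds one L-dimensional
    distance fingerprint per row, P the matching RP coordinates,
    `times` the scan time (s) of each fingerprint, z the query fingerprint."""
    l = np.linalg.norm(D - z, axis=1)        # Eq. (11), Euclidean measure
    nb = np.argsort(l)[:k]                   # the k nearest neighbours NB(z)
    w = np.exp(-theta * l[nb]) * np.exp(-lam * np.abs(times[nb] - t0))  # Eq. (14)
    return (w[:, None] * P[nb]).sum(axis=0) / w.sum()                   # Eq. (13)
```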
5 Proposed fusion method

In order to improve the performance of person tracking in camera networks, for each camera FOV, the person locations determined by the WiFi system are optimally assigned to the positioning results from the camera system. This allows us not only to maintain the high accuracy of vision-based person localization, but also to improve the performance of person tracking in camera networks by assigning the clearer ID of the WiFi adapter to each position determined by the camera system. Algorithm 1 shows the combined method of the WiFi and camera systems for people localization and identification.

At time t, on the 2D floor map, a set of position observations from the WiFi system (z_{i,t}^w) or the camera system (z_{j,t}^c) for multiple targets is available. Index i designates one among N targets located by the WiFi system, and index j refers to one of M positions observed by the camera system. We consider recursively two consecutive observations of the localization results from any available sensors. At time t, assume that we have a set of location observations coming from the WiFi system for N targets, with z_{i,t}^w = (X_{i,t}^w, Y_{i,t}^w, ID_{i,t}^w), and that at the previous time step (t-1) we obtained the observations z_{j,t-1}^c = (X_{j,t-1}^c, Y_{j,t-1}^c) for M positions from the camera system. Without loss of generality, we can consider these observations as the state estimations at time t-1. The prediction step of the Kalman filter (KalmanPrediction) is applied to estimate the next state x_{j,t}^c based on z_{j,t-1}^c. An assignment algorithm is then utilized to find optimal matchings between the estimated states x_{j,t}^c from the camera system and the observations z_{i,t}^w from the WiFi system. Considering the result K_{i,t} of the assignment as the observation at the current time t, the predicted state x_t is then corrected by the KalmanCorrection step, by which the WiFi-based positions are augmented with the vision-based positions.

Algorithm 1: Person tracking by fusion of position observations from WiFi and camera systems.

Input: position observations z from the WiFi and camera localization systems
Output: position estimations x

  Parameters initialization: A, H, P_1, Q, R
  for each set of position observations z do
    if z_{i,t} is from the WiFi localization system [z_{i,t}^w = (X_{i,t}^w, Y_{i,t}^w, ID_{i,t}^w)] then
      if z_{j,t-1} is from the camera localization system [z_{j,t-1}^c = (X_{j,t-1}^c, Y_{j,t-1}^c)] then
        [x_{j,t}^c, P_t] = KalmanPrediction(A, Q, z_{j,t-1}^c, P_{t-1})
        K_{i,t} = Assignment(x_{j,t}^c, z_{i,t}^w)
        [x_{i,t}^w, P_t] = KalmanCorrection(H, R, K_{i,t}, x_t, P_t)
        Save x_{i,t}^w as a state estimation at time t
      end
    else [z_{j,t} = (X_{j,t}^c, Y_{j,t}^c)]
      if z_{i,t-1} is from the WiFi localization system [z_{i,t-1}^w = (X_{i,t-1}^w, Y_{i,t-1}^w, ID_{i,t-1}^w)] then
        [x_{i,t}^w, P_t] = KalmanPrediction(A, Q, z_{i,t-1}^w, P_{t-1})
        K_{i,t} = Assignment(x_{i,t}^w, z_{j,t}^c)
        [x_{i,t}^w, P_t] = KalmanCorrection(H, R, K_{i,t}, x_t, P_t)
        Save x_{i,t}^w as a state estimation at time t
      end
    end
  end
  return x_{i,t}^w

5.1 Kalman filter

In the proposed fusion algorithm, the state prediction step of the Kalman filter is used to estimate the process state at a certain time based on the position observation or measurement obtained at the previous time. The correction step of the Kalman filter is done after the optimal assignment between the estimated states and the observations at a certain time. In this case, the process state to be estimated at a certain time is defined as the position p_t of a person in the real-world coordinate system of the 2D floor map. It is represented by a state vector x_t of the location coordinates pX_t and pY_t on the 2D floor map, together with the corresponding velocity values vX_t and vY_t:

x_t = (pX_t, pY_t, vX_t, vY_t)   (15)

A position observation z_t is then defined as follows:

z_t = (mX_t, mY_t)   (16)

Assuming constant velocity and acceleration in the movement of people, and that the position is measured n times per second, the state equations are defined as follows:

pX_t = pX_{t-1} + vX_{t-1} \Delta T   (17)

pY_t = pY_{t-1} + vY_{t-1} \Delta T   (18)

vX_t = vX_{t-1}   (19)

vY_t = vY_{t-1}   (20)

where ΔT = 1/n. The state transition matrix A and the state-measurement matrix H are then defined as:

A = \begin{bmatrix} 1 & 0 & \Delta T & 0 \\ 0 & 1 & 0 & \Delta T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}

Kalman-based tracking is started after the first successfully calculated position from the WiFi or camera system, with the initial state vector x_1. The initial covariance matrix P_1 for the initial state is:

P_1 = \begin{bmatrix} \sigma_{x_1}^2 & 0 & 0 & 0 \\ 0 & \sigma_{y_1}^2 & 0 & 0 \\ 0 & 0 & \sigma_{vx_1}^2 & 0 \\ 0 & 0 & 0 & \sigma_{vy_1}^2 \end{bmatrix}

The state noise covariance matrix Q and the measurement noise covariance matrix R are defined as:

Q = \begin{bmatrix} \sigma_{pX}^2 & 0 & 0 & 0 \\ 0 & \sigma_{pY}^2 & 0 & 0 \\ 0 & 0 & \sigma_{vX}^2 & 0 \\ 0 & 0 & 0 & \sigma_{vY}^2 \end{bmatrix}, \quad R = \begin{bmatrix} \sigma_{mX}^2 & 0 \\ 0 & \sigma_{mY}^2 \end{bmatrix}

where σ² denotes the squared deviation in centimeters from the real value of each quantity. The measurement noise refers to the noise of the calculated positions from the WiFi or camera system, and the state noise is defined according to the motion of people. The initial covariance matrix P_1 for the initial state x_1 is set under the assumption that the calculated position deviates by ±5 cm from the real position in both the X and Y directions, and the velocity by ±3 cm. Similarly, the state noise covariance matrix Q is set with standard deviations of ±5 cm and ±3 cm for the determined position and its velocity, respectively. The measurement noise covariance matrix R is described with a standard deviation of 3 cm for the FootPoint measurement in the X and Y directions, and ΔT is set to 1, meaning that the position is measured every second.
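A minimal NumPy sketch of the KalmanPrediction and KalmanCorrection steps used in Algorithm 1 is given below, with A, H, Q and R as defined above (ΔT = 1 s; the noise magnitudes follow the ±5 cm / ±3 cm / 3 cm settings stated in the text).

```python
import numpy as np

dT = 1.0                                   # position measured once per second
A = np.array([[1, 0, dT, 0],
              [0, 1, 0, dT],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)  # constant-velocity transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # observe position only
Q = np.diag([5.0, 5.0, 3.0, 3.0]) ** 2     # state noise (cm), as in the text
R = np.diag([3.0, 3.0]) ** 2               # measurement noise (cm)

def kalman_prediction(x, P):
    """Predict the next state from the previous observation/estimate."""
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    return x_pred, P_pred

def kalman_correction(x_pred, P_pred, z):
    """Correct the prediction with the assigned observation K_{i,t}."""
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(4) - K @ H) @ P_pred
    return x, P
```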
5.2 Optimal assignment

After the Kalman prediction step, we have a position estimation x_{j,t}^c or x_{i,t}^w for the camera or WiFi system, respectively. Consider first the case of a position estimation x_{j,t}^c at time t for the camera system, estimated from the previous observation of a vision-based location z_{j,t-1}. Then, optimal assignment at time t between x_{j,t}^c and z_{i,t}^w is applied. Assuming that the assignment of an estimated position x_j and an observation z_i incurs a cost d_{ij}, which is the Euclidean distance between them, the matrix D_{M×N} of the costs or distances between every x ∈ M and z ∈ N is defined as:

D = \begin{bmatrix} d_{11} & d_{12} & \dots & d_{1N} \\ d_{21} & d_{22} & \dots & d_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ d_{M1} & d_{M2} & \dots & d_{MN} \end{bmatrix}

where d_{ij} = \sqrt{(X_j^c - X_i^w)^2 + (Y_j^c - Y_i^w)^2}. The assignment is now formulated as a linear assignment problem:

\min \sum_{i \in N} \sum_{j \in M} d_{ij} x_{ij}   (21)

subject to

\sum_{i \in N} x_{ij} = 1 \quad \forall j \in M, \qquad \sum_{j \in M} x_{ij} = 1 \quad \forall i \in N, \qquad x_{ij} \ge 0 \quad \forall i \in N,\ j \in M

This optimal assignment is done with the following constraints:

– If N = M, for each pair (x_{j,t}^c, z_{i,t}^w), we augment the position x_{j,t}^c with the identity ID_{i,t}^w from z_{i,t}^w;

– If N > M, all unassigned z_{i,t}^w are kept with their original coordinates, which are computed from the WiFi-based localization system;

– If N < M, all unassigned x_{j,t}^c are considered as false positives and are discarded, because we assume in the surveillance system that all people entering the monitored areas hold WiFi-enabled devices and have checked in at the entrance.

The overall formula for these constraints is given as follows:

K_{i,t} = \begin{cases} (X_{j,t}^c, Y_{j,t}^c, ID_{i,t}^w) & \text{if } z_{i,t}^w \text{ is assigned;} \\ (X_{i,t}^w, Y_{i,t}^w, ID_{i,t}^w) & \text{otherwise,} \end{cases}

where K_{i,t} denotes the association between the position estimations x_{j,t}^c and the observations z_{i,t}^w. Each component of K_{i,t} is a random variable that takes its value among \{0, .., N\}. Based on this association, the location information from the WiFi-based observations is corrected according to the positions given by the camera system, and the corresponding ID from the WiFi system is assigned. The correction step of the Kalman filter is applied to update the predicted state with the current position observation K_{i,t}.

The same procedure is applied in the case in which WiFi-based location observations come before camera-based ones, where we have the optimal assignment of an estimated position x_i from the WiFi system and an observation z_j from the camera system.
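The linear assignment problem of Eq. (21), together with the N vs. M constraints above, can be sketched with SciPy's Hungarian-method solver (`scipy.optimize.linear_sum_assignment`). Identities travel with the WiFi row order, and all names here are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign(x_cam, z_wifi):
    """Solve the linear assignment of Eq. (21) on the Euclidean cost
    matrix D; rows are camera estimates, columns WiFi observations.
    Returns one fused observation K_{i,t} per WiFi target: an assigned
    target takes the camera position, an unassigned one keeps its own,
    and unassigned camera estimates are discarded as false positives."""
    x_cam, z_wifi = np.asarray(x_cam), np.asarray(z_wifi)
    D = np.linalg.norm(x_cam[:, None, :] - z_wifi[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(D)   # Hungarian method
    K = z_wifi.copy()                       # default: WiFi coordinates
    K[cols] = x_cam[rows]                   # assigned: camera coordinates
    return K                                # row i keeps identity ID_i^w
```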
6 Dataset and evaluation

6.1 Testing dataset

In order to evaluate the combined algorithm for person tracking using both the WiFi and camera systems, a multi-modal dataset with two scripts was constructed in this work. Script 1 is set up with simpler scenarios than Script 2. Two people are involved in Script 1, with random routes through two non-overlapping cameras. Some inter-person occlusions appear, but not as frequently as in Script 2. The visual data in Script 1 is used for camera-based person localization and Re-ID. Script 2 contains five scenarios involving different numbers of participants: one person, and two, three, and five moving people. The data in Script 2 is very challenging for both the WiFi-based and vision-based systems. People move through four different cameras. Severe occlusions happen because all people are required to move in close proximity along a fixed route (see Figure 8). Moreover, the similar human appearance is a challenge for the visual processing problems.

Figure 8: A 2D floor map of the testing environment in Figure 9, with the routing path of the moving people in the testing scenarios.

Figure 9: Testing environment.

The testing environment used for building the dataset is shown in Figure 9; 6 access points (APs) and 4 cameras are deployed in the environment. The APs are set to the same SSID, which assures continuous connectivity for mobile devices when people move from the range of one AP to another. The WiFi range of each AP is about 30–50 meters in radius, depending on the walls and obstacles in the environment. The AP specifications are the MAC address and the AP position in X, Y and Z. All APs used in the testing are Linksys E1200 devices. A person holds a WiFi-enabled device and moves in the testing environment with a normal velocity of 1–1.3 m/s.

The time duration of each scenario is from 3 to 5 minutes, with about 400 RSSI values acquired from the 6 APs and an average time deviation between two consecutive samples of 2 seconds. The mobile devices and cameras are time-synchronized to Internet time. This synchronizes the data captured from both the camera and WiFi systems. Based on this, we can compute the real-world positions of a mobile user on the 2D floor map at each time. The time stamp of each person location calculated from the camera or WiFi system provides the basis for processing multi-modal object localization. The WiFi data is scanned from the mobile devices and stored in XML files. These devices continuously capture the signals from the available APs in the environment. The AP specifications are saved as records of scanning time, MAC address, AP name, and RSSI. The APs are distinguished by their own MAC addresses.

For the visual data, we manually assign FootPoint positions on the captured frames with the corresponding time stamps and IDs. These positions are then automatically transformed into 2D locations on the floor map by using camera calibration and a homography matrix. The person ID assigned in the visual data is equivalent to the ID of the WiFi adapter by a predefined convention. In short, for each scenario, the ground truth data is obtained and saved as XML files which contain the following records:

– Frame number.
– Person ID.
– Coordinates of the top-left and bottom-right positions of the bounding box containing the person.
– The image coordinates of the FootPoint position.
– The corresponding coordinates of the FootPoint position on the 2D floor map.

In case no person is detected, all records except the frame number are set to -1.

Figure 10: Visual examples in Script 2. The first row contains frames for the scenario of one moving person; the scenarios with 2, 3 and 5 moving people are shown in the second, third and fourth rows.

Figure 10 illustrates examples in Script 2. The frames in the first row show the scenario of one moving person, while those in the second, third and fourth rows are frames for the scenarios of two, three and five moving people.
For the WiFi data recorded outside the camera FOVs, the ground truth of the person locations in these regions is calculated by a pedestrian foot-counting program. It takes input information from the acceleration and direction sensors that are available on smartphones or tablets [20]. Basically, the positions of the mobile user in this region are computed from the route length that the user covers between marking points or reference points. This distance is calculated by the foot counter, taking into account the average step length of each particular person. The foot counter gives a positioning result of 5 m with a deviation of 3 m for a route length of 120 m. In our test, the route length outside the camera views is only about 10 m. Since the bias of the foot counter accumulates over time, over 10 m this deviation is 0.8 m (equivalent to 8% of the route length). This yields a deviation of 8 cm per meter in the labeled dataset in comparison with the true positions. After the synchronization step between the WiFi and visual data, an interpolation method is applied to calculate the person positions that are outside the camera fields of view.

6.2 Evaluation metrics

In order to evaluate the performance of vision-based tracking, the metrics of Multiple Object Tracking Precision (MOTP) [18], Global Multiple Object Tracking Accuracy (GMOTA) [19], and the CMC (Cumulative Match Curve) are utilized. Assume that for each time step t, a multi-person tracker outputs a set of hypotheses \{h_1, .., h_m\} for a set of visible people \{u_1, .., u_n\}. MOTP measures the positioning error over all matched pairs of person and tracker hypothesis on all frames. This metric is defined by:

MOTP = \frac{\sum_{i,t} d_{i,t}}{\sum_t c_t}   (22)

where d_{i,t} is the Euclidean distance between the ground truth and the tracker hypothesis for person i at time frame t — in this work, the Euclidean distance between the ground-truth and tracker-hypothesis FootPoint positions. The element c_t indicates the number of matched pairs at time step t.

GMOTA is an extension of MOTA (Multiple Object Tracking Accuracy) [18]. MOTA measures the number of errors the tracker makes in terms of false negatives (missed detections), false positives (wrong detections), mismatches and failures to recover tracks. This score is computed as follows:

MOTA = 1 - \frac{\sum_t (FN_t + FP_t + ID_t)}{\sum_t g_t}   (23)

where FN_t is the number of false negatives, FP_t the number of false positives, ID_t the number of instantaneous identity switches, and g_t the number of ground-truth detections at time frame t. In the GMOTA score, ID_t is replaced by a global ID_t (gID_t). This means that gID_t represents the performance of the tracker in preserving person identity assignments in a global manner, instead of the instantaneous identity assignments of MOTA:

GMOTA = 1 - \frac{\sum_t (FN_t + FP_t + gID_t)}{\sum_t g_t}   (24)

The CMC is employed as the performance evaluation metric for vision-based person Re-ID. The CMC curve presents the expectation of finding the correct match in the top n matches.
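The metrics in Eqs. (22)–(24) reduce to a few array reductions. A minimal sketch follows; per-frame count arrays are assumed as inputs, and the 90%-reliability statistic used for the WiFi system (described next) is included for completeness.

```python
import numpy as np

def motp(distances, matches_per_frame):
    """Eq. (22): total matched-pair distance over total matched pairs."""
    return np.sum(distances) / np.sum(matches_per_frame)

def gmota(fn, fp, gid, gt):
    """Eq. (24): 1 - (misses + false positives + global ID errors) / GT.
    All arguments are per-frame counts (arrays of equal length)."""
    return 1.0 - (np.sum(fn) + np.sum(fp) + np.sum(gid)) / np.sum(gt)

def error_at_reliability(errors, reliability=0.9):
    """WiFi statistic: the error value that 90% of test errors fall below."""
    return np.percentile(errors, reliability * 100)
```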
The accuracy of the WiFi-based localization system is evaluated by the statistical values of maximal error, average error, and error at a reliability of 90%. The maximal error is the maximum distance deviation in meters between the positions determined by the system and the ground-truth positions. The average error is the average distance deviation in meters between the positions determined by the system and the ground-truth positions. The error at a reliability of 90% indicates the distance deviation value in meters below which 90% of the test results fall. The performance of the fusion method is evaluated in this work by the GMOTA metric.

6.3 Experimental results

In vision-based person localization, in each camera FOV, person identification is done by a so-called process of identification by tracking. This means a trajectory which belongs to an individual in the current frame is linked to the corresponding one from the previous frame based on an optimal assignment of the Euclidean distances between them. However, this results in ID switches when people cross each other. The proposed method in Section 3.2 for person Re-ID helps to solve not only person identification in each camera FOV, but also person Re-ID among multiple cameras, by using a robust appearance-based descriptor built on the detected human ROI at each FootPoint position. This allows tracking by identification to be performed. However, the person identification and Re-ID performance still needs to be improved, especially in case of inter-person occlusions and people with similar appearances. The proposed fusion algorithm adds the clearer ID information of the WiFi adapter for performing tracking by identification.

In the following sections, the testing results for WiFi-based localization, vision-based localization and Re-ID, and fusion-based tracking are shown.

6.3.1 WiFi-based localization results

The system parameters of the WiFi-based localization model are calculated first; based on these, the positioning results are then given. Firstly, the training process using the GA is set up with the configuration provided in Table 1. Using these data, the optimal parameters are produced as in Table 2.

Parameter             Value     Parameter             Value
Population size       20        Tolerance             10^-6
Elite count           5         Selection             Uniform
Crossover fraction    0.5       Crossover             Scattered
Time limit            No        Mutation              Uniform
Maximal generations   No        Creation population   Uniform

Table 1: Genetic algorithm configuration.

Parameter   Values for the first scenario   Values for the second scenario
P_0         -41 dBm                         -36.1757 dBm
n           1.1                             2.2029
k_σ         1.0035 m^-1                     5.3147 m^-1
r_0         5 m                             2.5117 m
k_d         49.23 dBm·m^-1                  5.1311 dBm·m^-1

Table 2: Optimized system parameters for the first and the second scenarios of the testing environments.

Fingerprint feature   Maximal error (m)   Average error (m)   Error at reliability of 90% (m)
RSSI                  6.3                 1.86                2.99
Distance              6.27                1.89                2.98

Table 3: Evaluations for distance and RSSI features in case of using coefficient λ.

Fingerprint feature   Maximal error (m)   Average error (m)   Error at reliability of 90% (m)
RSSI                  6.06                1.76                3.55
Distance              6.5                 1.59                2.9

Table 4: Localization results using the distance and RSSI features, without using coefficient λ.

Secondly, the weights for different values of θ as a function of dissimilarity are given in Figure 11. Different values of λ are presented in Figure 12: with λ = 0.5 × 10^-6, the influence is reduced by a factor of 3 when the fingerprints were scanned 1 month before the testing time (roughly 2.6 × 10^6 seconds). Similarly, when the fingerprints were taken 2 months before the testing time, their influence is only 10% of that of new fingerprints. In this work, we choose k = 9, θ = 1.1 and λ = 2 × 10^-6. The radio maps and fingerprint locations in the testing environment are shown in Figure 13a and Figure 13b.
The regions with a deep pink color indicate that more APs are available than in the regions with a light pink color.

The localization experiments are conducted using the fingerprinting method with distance features calculated by the proposed probabilistic propagation model. Comparative results are also given for the fingerprinting method with RSSI features. Additionally, the stability and reliability of the radio map with distance features is confirmed by the evaluations with the coefficient λ.

Figures 14, 15 and 16 show the comparative results when the coefficient λ is taken into account. The localization results, the distribution of the localization results compared to the real locations, and the reliability of the localization result as a function of the localization error are shown correspondingly in these figures. The details of these results are given in Table 3. It can be seen from the experiments that the positioning errors at a reliability of 90% when using distance features are comparable with those obtained using RSSI features (2.98 m vs. 2.99 m). However, without using λ, the localization reliability for RSSI features decreases, whilst it remains stable for distance features. The results for this are shown in Figures 17, 18 and 21, and in Table 4: the error at a reliability of 90% is 3.55 m for RSSI features, but 2.9 m for distance features. The above experiments show that using distance features for the fingerprint data results in more stable and reliable radio maps in comparison with using RSSI features. Moreover, this also brings a lower cost for updating the fingerprint data, which is considered one of the most challenging problems of the fingerprinting method in WiFi-based localization.

Figure 11: Weights for different values of θ (0.5 to 1.3) as a function of dissimilarity.

Figure 12: Weights for different values of λ (1/1000000 to 1/6000000), plotted against time.

Figure 13: (a) The radio map, with (b) 2000 fingerprint locations collected in the testing environment.

            Vision-based evaluations            The proposed fusion algorithm
            Hallway (Cam 1)  Showroom (Cam 3)   Hallway (Cam 1)  Showroom (Cam 3)
MOTP (cm)   24.3             21.3               24.3             21.3
FN (%)      17.1             26.4               7.6              12.6
FP (%)      22.7             18.3               3.4              2.1
gID         28.3             11.6               4.9              2.3
GMOTA (%)   31.2             52.6               83.9             85.7

Table 5: The comparative results of the proposed fusion algorithm against the vision-based evaluations on the testing data of Script 1.

Figure 14: Localization results with distance and RSSI features when using coefficient λ (ground-truth path vs. localization results).

6.3.2 Experimental results for vision- and fusion-based tracking

The performance of vision-based person localization and Re-ID is evaluated on the Script 1 and Script 2 databases. In addition, the comparative results obtained from the fusion system of camera and WiFi are also reported on these.

Firstly, the vision-based person Re-ID evaluations are done on the Script 1 data. The human ROIs are manually extracted from the frames captured by three non-overlapping cameras: Cam 1 (hallway), Cam 2 (lobby) and Cam 3 (showroom).
The human ROIs from Cam 2 are used for the training phase (see Figure 19) and the human ROIs from Cam 1 and Cam 3 for the testing phase (see Figure 20). We train the system with 10 people in total, including the two test subjects, using the images of human ROIs extracted from Cam 2. Figure 22 shows the person recognition rates for this experiment, with a Rank 1 rate of 51.1%.

Table 5 shows the results for vision-based localization, with the two scenarios of Hallway (Cam 1) and Showroom (Cam 3) considered. MOTP evaluated on the vision-based localization system is 24.3 cm and 21.3 cm for the Hallway and Showroom scenarios, respectively. These values are retained in the fusion model of camera and WiFi.

Figure 15: Distribution of the localization error for distance and RSSI features when using coefficient λ.

The GMOTA ratio for the Showroom is better than for the Hallway, with 52.6% compared to 31.2%. However, by integrating with WiFi, these values increase dramatically, to 83.9% for the Hallway and 85.7% for the Showroom. This results from the sharp decreases in the rates of FN, FP and gID in both scenarios. Additionally, even in comparison with the perfect case of manual human detection in vision-based Re-ID, the performance of person tracking by identification is not as good as the results from the proposed fusion algorithm.

Secondly, for further evaluation of the proposed fusion algorithm, experiments are done on the data of Script 2. This dataset is very challenging compared to Script 1, because of severe occlusions and the similarity in human appearance. Moreover, people moving together along the same route is also a challenge for WiFi-based localization.

In the experiments with this data, we use the ground-truth data of the FootPoint positions and the corresponding human ROIs for the testing evaluations. The parameter gID in the GMOTA metric now indicates the performance of the tracker in maintaining the person ID when he/she moves from one camera FOV to others or re-appears in one camera FOV. Table 6 shows the comparative results of GMOTA when applying the fusion algorithm, and of Rank 1 for person Re-ID.

Figure 16: Localization reliability for distance and RSSI features when using coefficient λ.

Figure 17: Localization results for distance and RSSI features, without using coefficient λ.

It should be noted that FN and FP are not included in the testing evaluations of GMOTA, because we use the ground-truth data of the FootPoint positions and human ROIs. In this case, only gID is taken into account. This means that the performance of maintaining the person ID in tracking now depends only on the performance of the WiFi-based person localization. In comparison with the GMOTA values from Script 1, the GMOTA figures from Script 2 are much lower: only 31.7% for the scenario of two moving people, and 16.5% and 11.2% for the scenarios of three and five moving people, respectively. This can be explained by the data of Script 2 being much more challenging than Script 1.
However, compared with person Re-ID by the kernel descriptor, these results are much higher. In these experiments, besides the testing people, we train the system with 20 other people at the check-in gate for person Re-ID. The recognition rate at Rank 1 is only 12.6% for the scenario of two moving people, which is 19.1% lower than the fusion-based method. The Rank 1 figures for the scenarios of three and five moving people are 8.9% and 5.6%, respectively. Clearly, the performance of person Re-ID based on the kernel descriptor degrades when human appearances are similar.

Figure 18: Distribution of localization error for distance and RSSI features, without using coefficient λ (axes: X (m) vs. Y (m)).

Table 6: Experimental results for person tracking and person Re-ID with the Script 2 dataset.

            Two people  Three people  Five people
GMOTA (%)   31.7        16.5          11.2
Rank 1 (%)  12.6        8.9           5.6

From the above comparative evaluations, we can see that by using the proposed fusion algorithm, the performance of person tracking by identification and person Re-ID is improved significantly. The high-accuracy vision-based person localization, together with the clear ID information from the WiFi-enabled device, is integrated into each detected FootPoint position. This allows tracking by identification in each camera FOV, and based on this, person Re-ID in non-overlapping camera networks can be solved more effectively than by applying a vision-based method alone.

Figure 19: Training examples of manually-extracted human ROIs from Cam 2 for person 1 (images on the left) and person 2 (images on the right).

Figure 20: Testing examples of manually-extracted human ROIs from Cam 1 (left column) and Cam 3 (right column) for (a) person 1 and (b) person 2.

Figure 21: Localization reliability for distance and RSSI features, without using coefficient λ (axes: Error (m) vs. reliability distribution (%)).

Figure 22: Person Re-ID evaluations on the testing data of two moving people (axes: Rank vs. recognition rate).

7 Conclusion

In this work, person localization and Re-ID in surveillance regions covered by WiFi signals and cameras with disjoint FOVs are improved by a fusion algorithm based on a Kalman filter and an optimal assignment technique. This algorithm operates on the position observations on the 2D floor map obtained from each single system of camera or WiFi. Evaluation on the multimodal dataset shows that the proposed fusion algorithm clearly outperforms the single-modality alternatives. The high positioning accuracy of the vision-based system is maintained in the multimodal person localization system. Additionally, the fusion algorithm allows tracking by identification, and based on this, person Re-ID in non-overlapping cameras is done with clear identity information taken from the WiFi-based system. In future work, other localization techniques, such as RFID or UWB, can be integrated into the multimodal system in order to improve the positioning accuracy and person Re-ID; the fusion algorithm for person localization and Re-ID will be correspondingly broadened to accommodate such additions.
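The conclusion names a Kalman filter combined with an optimal assignment technique. As a minimal sketch of the assignment step only (assuming Euclidean costs over Kalman-predicted 2D floor-map positions; an illustration, not the authors' implementation):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_ids(predicted_pos, observed_pos):
    """Match Kalman-predicted track positions to new 2D observations.

    predicted_pos -- array (num_tracks, 2) of predicted floor positions
    observed_pos  -- array (num_obs, 2) of camera/WiFi observations
    Solves the optimal assignment (Hungarian method, cf. Kuhn [7])
    over a Euclidean cost matrix so each track keeps a consistent ID.
    """
    cost = np.linalg.norm(
        predicted_pos[:, None, :] - observed_pos[None, :, :], axis=2)
    track_idx, obs_idx = linear_sum_assignment(cost)
    return list(zip(track_idx, obs_idx))
```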
Acknowledgement

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.04-2013.32.

References

[1] Van den Berghe, Sam and Weyn, Maarten and Spruyt, Vincent and Ledda, Alessandro (2011) Combining wireless and visual tracking for an indoor environment, International Conference on Indoor Positioning and Indoor Navigation (IPIN-2011).
[2] Miyaki, Takashi and Yamasaki, Toshihiko and Aizawa, Kiyoharu (2007) Visual tracking of pedestrians jointly using wi-fi location system on distributed camera network, 2007 IEEE International Conference on Multimedia and Expo, IEEE, p. 1762–1765.
[3] Rekimoto, Jun and Shionozaki, Atsushi and Sueyoshi, Takahiko and Miyaki, Takashi (2006) PlaceEngine: a WiFi location platform based on realworld folksonomy, Internet Conference, p. 95–104.
[4] Cheng, Yu-Chung and Chawathe, Yatin and LaMarca, Anthony and Krumm, John (2005) Accuracy characterization for metropolitan-scale Wi-Fi localization, Proceedings of the 3rd International Conference on Mobile Systems, Applications, and Services, ACM, p. 233–245.
[5] Alahi, Alexandre and Haque, Albert and Fei-Fei, Li (2015) RGB-W: When Vision Meets Wireless, Proceedings of the IEEE International Conference on Computer Vision, IEEE, p. 3289–3297.
[6] Pham, T. T. T., Le, T. L., Vu, H., and Dao, T. K. (2017) Fully-automated person re-identification in multi-camera surveillance system with a robust kernel descriptor and effective shadow removal method, Image and Vision Computing, Elsevier, p. 44–62.
[7] Kuhn, Harold W (1955) The Hungarian method for the assignment problem, Naval Research Logistics Quarterly, Wiley Online Library, p. 83–97.
[8] Zhang, Zhengyou (2000) A flexible new technique for camera calibration, Pattern Analysis and Machine Intelligence, IEEE, p. 1330–1334.
[9] Thi Thanh Thuy Pham, Anh Tuan Pham, Hai Vu (2015) A new technique for linking person trajectories in surveillance camera network, Conference on Fundamental and Applied IT Research (FAIR), p. 8–15.
[10] Bo, Liefeng and Ren, Xiaofeng and Fox, Dieter (2010) Kernel descriptors for visual recognition, Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, p. 244–252.
[11] Dao, Trung-Kien and Pham, Thanh-Thuy and Castelli, Eric (2013) A robust WLAN positioning system based on probabilistic propagation model, 9th International Conference on Intelligent Environments (IE), IEEE, p. 24–29.
[12] Goldsmith, A. (2005) Wireless communications, Cambridge University Press.
[13] Roberts B. and Pahlavan K. (2009) Site-specific rss signature modeling for wifi localization, Global Telecommunications Conference, IEEE, p. 1–6.
[14] Munoz D., Lara F.B., Vargas C., and Enriquez-Caldera R. (2009) Position location techniques and applications, Academic Press.
[15] Haupt, Randy L and Haupt, Sue Ellen (2004) Practical genetic algorithms, John Wiley & Sons.
[16] Jungmin So, Joo-Yub Lee, Cheal-Hwan Yoon, Hyunjae Park (2013) An Improved Location Estimation Method for Wifi Fingerprint-based Indoor Localization, International Journal of Software Engineering and Its Applications.
[17] Arsham Farshad, Jiwei Li, Mahesh K. Marina, Francisco J. Garcia (2013) A Microscopic Look at WiFi Fingerprinting for Indoor Mobile Phone Localization in Diverse Environments, International Conference on Indoor Positioning and Indoor Navigation.
[18] Bernardin, Keni and Stiefelhagen, Rainer (2008) Evaluating multiple object tracking performance: the CLEAR MOT metrics, EURASIP Journal on Image and Video Processing, Springer, p. 1–10.
[19] Ben Shitrit, Horesh and Berclaz, Jerome and Fleuret, François and Fua, Pascal (2013) Tracklet-based Multi-Commodity Network Flow for Tracking Multiple People, No. EPFL-PATENT-186751, WO.
[20] Kothari, Nisarg and Kannan, Balajee and Glasgow, Evan D and Dias, M Bernardine (2012) Robust indoor localization on a commercial smart phone, Procedia Computer Science, Elsevier, p. 1114–1120.

Persons-In-Places: a Deep Features Based Approach for Searching a Specific Person in a Specific Location

Vinh-Tiep Nguyen, Thanh Duc Ngo, Minh-Triet Tran, Duy-Dinh Le and Duc Anh Duong
University of Information Technology, University of Science
E-mail: {tiepnv, thanhnd}@uit.edu.vn, tmtriet@fit.hcmus.edu.vn, {duyld,ducda}@uit.edu.vn

Keywords: video instance search, deep neural network, location search, person search

Received: March 29, 2017

Video retrieval is a challenging task in computer vision, especially with complex queries. In this paper, we consider a new type of complex query which simultaneously covers person and location information. The aim is to search for a specific person in a specific location. Bag-Of-Visual-Words (BOW) is widely known as an effective model for representing rich-textured objects and scenes of places. Meanwhile, deep features are powerful for faces. Based on such state-of-the-art approaches, we introduce a framework to leverage the BOW model and deep features for person-place video retrieval. First, we propose to use a linear kernel classifier instead of the L2 distance to estimate the similarity of faces, given that faces are represented by deep features. Second, scene tracking is employed to deal with cases in which the face of the query person is not detected. Third, we evaluate several strategies for fusing individual person search and location search results. Experiments were conducted on a standard benchmark dataset (TRECVID Instance Search 2016) with more than 300 GB in storage and 464 hours in duration.

Povzetek: V prispevku je opisana metoda povpraševanja po osebi in lokaciji iz video vsebin.

1 Introduction

With the rapid growth of video recording devices, many videos from diverse domains such as professional or amateur film making, surveillance and home recording are being created. These vast video collections are being shared on video broadcasting sites (e.g., YouTube). One of the most fundamental needs is to help users find exactly what they are looking for in video databases. To search directly on videos, we consider visual instance search on video databases. The term instance search (INS) is defined formally by TRECVID [13]: finding video segments of a certain specific object, place or person, given visual examples from a video collection. Query types vary, including rich-textured, fairly-textured and deformable objects. This makes instance search a very challenging task, since we have no prior information about the query. The objective of this problem is to find the person and the location in a large-scale video dataset. This type of query is important since persons and locations are the two most popular query objects. It has many practical applications, such as surveillance systems and personal video archive management.
This query topic is also very hard because of large variations in size, lighting conditions and viewpoint. Figure 1 gives an example of this type of query. Images in the first row are examples of a pub that a user wants to find. These images cover multiple views of a location with many irrelevant or noisy objects such as humans and decorations. These objects may lower retrieval accuracy due to noisy features. Images in the second row are examples of the person that the user also needs to find if he appears at the pub. Persons are special query objects because they are 3D objects with multiple views and deformable, with varying clothes textures. All of these make our retrieval task with this compound query more challenging.

A very natural approach is to combine the scores of recognizing the face and the location. There are some challenges in this approach:

– The scores are independent and incomparable. This makes typical fusion techniques such as average fusion ineffective.
– Frames with very clear and recognizable faces often devote a large portion of the image to the person and thus carry less information about the context scene. Hence, a frame with a higher score for recognizing a face may have a lower score or rank for the location, and vice versa. This leads to low performance when the scores are simply combined.
– In a video scene that contains a person and a location, both of them are not always shown perfectly: the person may change their head pose in multiple directions, while the location may change points of view over time. However, query examples do not cover all views of the target objects.

Figure 1: A query topic includes location examples (first row images) and person examples (second row images) marked by magenta boundaries.

Most state-of-the-art object instance retrieval systems are based on a bottom-up approach with the very well-known Bag-of-Visual-Words (BOW) model [23], which benefits from powerful local descriptors for matching textures and then checks geometric consistency to further improve the accuracy. This approach relies on the key assumption that two similar objects share a significant number of local patches that can be matched against each other.

When searching for rich-textured instances which contain enough discriminative texture patterns (e.g. locations, buildings, book covers, paintings, etc.), there are some ambiguous patches that share similar shapes with the query instance but belong to an irrelevant object. However, the ratio of these patches is low; thus the similarity scores of images containing the correct instance are higher than those of incorrect ones. Moreover, its extensions, e.g. geometric consistency checking [16][30] and query expansion [8][7][1], further improve the performance of the searching system significantly.

When searching for objects with highly flexible appearance, such as humans, performance is still very low due to the limited representational capacity of the BOW model. For the first video segment in which the query person appears, the problem is equivalent to face recognition, without using other information such as clothes texture features. From that segment to the end of a scene, people are likely to be in the same place even if the face disappears. In this paper, we propose a system which leverages both BOW and Convolutional Neural Network (CNN) based features for retrieving this new type of query. For location search, we combine BOW based and CNN based features to improve the performance.
For person search, we use the VGG-Face feature for recognizing the first video shot in which the target person appears. Instead of using a distance metric such as L2, we propose a linear kernel method to learn the high-level features encoded by a deep CNN. Finally, in order to boost the recall of the system, we implement scene tracking to keep tracking the shots following the high-response ones.

The rest of this paper is organized as follows. Section 2 presents related work. Details of our instance search framework are presented in Section 3. Section 4 presents our experimental results on the TRECVID dataset. Finally, Section 5 concludes the paper.

2 Related work

To improve the performance of INS systems, multiple techniques have been proposed, such as the rootSIFT feature [1], large vocabularies [16] and soft assignment [17]. Among them, spatial verification is one of the most effective approaches, and it also serves as the prerequisite step for other advanced techniques such as query expansion. Spatial verification can be classified into two categories: spatial reranking [16][30][33] and spatial ranking [10][5][21]. These approaches work very well on large, rich-textured objects such as locations.

To further improve the performance, Wan et al. explore deep learning techniques with application to the instance search task [31]. They show that deep features from a CNN model pre-trained on a large-scale dataset can be used for representing an image or object in a new instance search task. Moreover, by retraining the deep models on the new domain, the retrieval performance can be boosted significantly. Although the amount of training data is only a few examples per query object, a network pre-trained with parameters learned from a previous large-scale dataset converges quickly on the new data domain.

In addition to retraining the CNN, Babenko et al. investigate the performance of compressed deep features, where plain PCA or a combination of PCA with discriminative dimensionality reduction results in very short codes with state-of-the-art performance [4]. They explain that passing an image through the network discards much of the information that is irrelevant for classification (and for retrieval). Thus, CNN based neural codes from deeper layers retain less (useless) information than unsupervised aggregation-based representations; therefore PCA compression works better for neural codes. Besides deep encoding techniques, the authors also introduce and evaluate a new simple and compact global image descriptor and investigate the reasons underlying its success [3]. They show that feature aggregation using sum-pooling outperforms max-pooling on deep features from fully connected layers [18], as well as VLAD [2] and democratic aggregation [11], which were successfully applied to SIFT features.

Another problem this paper focuses on is face recognition in images and videos. We classify the methods proposed in the literature into two groups: those that do not use deep learning and those that do. Methods in the first group (also named "shallow" methods) start by extracting a representation of the face image using hand-crafted local image descriptors such as SIFT, LBP and HOG [9][12][32]; then they aggregate such local descriptors into an overall face descriptor by using a pooling mechanism, for example the Fisher Vector [14][22].
This work is concerned mainly with deep architectures, which currently reach state-of-the-art performance. The idea of such methods is to use a CNN feature extractor with parameters learned by composing several linear and non-linear operators. One of the representative methods of this approach is DeepFace [28]. This method uses a deep CNN trained to classify faces using a dataset of 4 million examples of 4000 persons. The goal of training is to minimize the distance between congruous pairs of faces (i.e. portraying the same identity) and maximize the distance between incongruous pairs, a form of metric learning. The authors later extended this work in [29] by increasing the size of the dataset to 10 million persons and 50 images per person. They proposed a bootstrapping strategy to select identities to train the network and showed that by controlling the dimensionality of the fully connected layer the generalisation of the network can be improved. The DeepID series of papers by Sun et al. [24][26][27][25] extends DeepFace, with each paper improving the performance on LFW and YTF incrementally and steadily. A number of new ideas were introduced over this series of papers, including: using multiple CNNs [26], a Bayesian learning framework [6] to train a metric, multi-task learning over classification and verification [24], different CNN architectures which branch a fully connected layer after each convolution layer [27], and very deep networks [25]. Compared to DeepFace, DeepID does not use 3D face alignment, but a simpler 2D affine alignment, and trains on a combination of CelebFaces [26] and WDRef [6]. However, the final model in [25] is quite complicated, involving around 200 CNNs.

Recently, research from Google [20] trained a CNN using a massive dataset of 200 million face identities and 800 million image face pairs. Their triplet-based loss compares two congruous faces (a, b) and a third incongruous face c. Differently from other metric learning approaches, their goal is to make a closer to b than to c; comparisons are always relative to a pivot face. In training, this loss is applied at multiple layers, not just the final one.

In this paper, we follow the VGG-Face descriptor network [15], which designs a procedure able to assemble a large-scale dataset with small label noise whilst minimizing the amount of manual annotation involved. The authors use weaker classifiers to rank the data presented to the annotators for reranking. They also show that a deep CNN can achieve results comparable to the state of the art with appropriate training and without any special techniques. In order to apply it to a new task (instance search) and data domain, instead of using the activation of the last layer, we propose to use the feature extracted from one of the fully connected layers with a linear classifier (e.g. a support vector machine with linear kernel) to train a face model for the query person. To further improve the performance of the instance search system, especially in the case that the target person turns his/her back to the camera, we propose to combine person tracking with scene tracking.

3 Proposed framework

This section describes our proposed framework and its configurations. Our proposed system includes four main modules: BOW based retrieval, location learning for verification, face learning for recognition, and final fusion. Figure 2 sketches out the work flow of the main components in our INS system.
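To make the data flow concrete before the detailed description, here is a minimal skeleton of the four modules (all stage functions are injected placeholders; names and signatures are assumptions for illustration, not the authors' code):

```python
def persons_in_places_search(location_examples, person_examples, shots,
                             bow_search, verify_location,
                             train_face_model, face_score,
                             propagate_scene_scores, top_k=1000):
    """Skeleton of the four-module INS pipeline (illustrative only)."""
    # 1. BOW-based location retrieval over the whole shot database.
    ranked = bow_search(location_examples, shots)[:top_k]
    # 2. Location verification (RANSAC geometry + CNN classifier):
    #    shots with negative decision values are dropped.
    ranked = [s for s in ranked if verify_location(location_examples, s) > 0]
    # 3. Face-based reranking with a linear-kernel face model.
    model = train_face_model(person_examples, ranked)
    scored = [(s, face_score(model, s)) for s in ranked]
    # 4. Scene tracking propagates positive scores to following shots.
    scored = propagate_scene_scores(scored)
    return sorted(scored, key=lambda t: t[1], reverse=True)
```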
Given a compound query topic including person and location examples, our goal is to rank video shots containing that combination. Each example is a video frame of the location or person captured at a specific point of view, as shown in Figure 1. In our framework, instead of using all frames of a video shot, we perform key frame extraction at 5 frames per second to save computational cost. For simplicity of notation, we only consider a set of query examples and the key frames of one shot in the video dataset; other shots are processed similarly. Firstly, for each location example, we extract local features using the Hessian-Affine detector and the rootSIFT descriptor, then quantize them using a codebook trained on the video database. In order to reduce the effect of noisy features caused by irrelevant persons, we remove all visual words inside bounding boxes found by a person detector. In this paper, we use Faster R-CNN [19] with a network pre-trained on PASCAL VOC 2007 to find person regions. Each location frame is finally represented by a BOW feature vector $L_k$ with the tf-idf weighting scheme. For each person example, we only use the information detected by a face detector, since the target person may change clothes over time. Each face bounding box is described by a CNN based descriptor and represented by a feature vector $F_p$.

Since location and person examples are independent, we could compute two rank lists independently. However, because the BOW model scales to large video data, we use location features to retrieve rank lists as the first step and then use face features for later reranking. The top K retrieved shots based on the BOW similarity score are then used for the reranking stage. Note that the BOW model is a non-structured model which does not take into account the spatial relationships between visual words. To remove irrelevant shots, we combine a RANSAC based algorithm and a learning based approach on the high-level feature vectors produced by the very deep CNN network VGG-19.

Figure 2: Framework overview.

The second part of our system is person recognition based reranking. A person example includes a color image and its mask, which helps the system separate the person of interest from irrelevant objects. In this case, we only focus on face features, since the target person may change clothes over time. We use a face detector and a face descriptor to extract representative features of the query person. After this stage, each person is represented by a set of deep feature vectors. A typical way to compare face features is using a symmetric distance or similarity score. In this approach, each component of a feature vector is treated equally. However, this vector is a high-level feature which describes many parts of a face; some of them are important and some are not. Hence, we propose to use a linear classifier to learn the weights of a face feature, and then compute the similarity score between the face model and a video shot.

Finally, we propose a final fusion step which takes into account all components of the system: BOW based location search, CNN based irrelevant location removal, face based reranking, and scene tracking. In a video scene that contains a person and a location, both of them are not always shown perfectly: a person may change their head pose in multiple directions while a location may change points of view over time. However, query examples of face and location are limited and incomplete.
To propagate the score of a positive shot, we carry that value over to the following shots with a multiplicative factor.

3.1 Location search

In the first stage of the system, we retrieve the top K shots that are similar to the location examples using the BOW model with local features. In this paper, we use the state-of-the-art configuration of the BOW framework that has been used for image retrieval. Local features of each key frame of a shot are extracted using the Hessian-Affine detector and the rootSIFT feature descriptor. Each feature is represented by a 128-dimensional vector. All features gathered from the database video frames are clustered using the approximate K-Means algorithm (AKM) with a very large number of codewords. Owing to hardware limitations, only 100 million features are randomly sampled to train 1 million codewords. The features are then quantized using the codebook with a hard-assignment strategy. Finally, each video frame is represented by a very sparse BOW feature vector using the tf-idf (term frequency-inverse document frequency) weighting scheme. Because the rank list counts video shots, not video frames, we aggregate all BOW vectors of the frames of a shot into a single one for compact representation and fast retrieval. In this encoding scheme, the j-th frame of the i-th video shot is represented by a BOW feature vector $S_{i,j}$. We accumulate all vectors of a shot into a single one using average pooling:

$S_i = \frac{1}{n} \sum_{j=1}^{n} S_{i,j} \qquad (1)$

where n is the number of key frames of the shot. The feature vectors of the video shots are then used to build an inverted index, which significantly boosts the speed of retrieval.

Figure 3: Two images illustrate a location example (left) and a query person example (right). For the location example, there may be some irrelevant persons (marked by yellow boundaries) whose noisy visual words take part in the BOW feature vector of the frame. For the person example (marked by a magenta boundary), the face feature is one of the most important features for retrieval.

The similarity between the i-th shot and the given location is computed by the following formula:

$LS_i = \frac{1}{n'} \sum_{k=1}^{n'} \mathrm{asym}(L_k, S_i) \qquad (2)$

where n′ is the number of query examples and asym is an asymmetrical similarity score [34]. The top K shots returned by the BOW model are then reranked in the next steps. One important parameter in this initial step is K, the threshold for selecting the top ranked shots. By observing the z-score normalized distances of all query examples, we found that they have the same distribution, as shown in Figure 4. Based on this, we fixed the cut-off threshold for the top K shots at −2.5.

The main assumption of the BOW model is that two similar objects share a significant number of local patches that can be matched against each other. The chosen query examples are often captured in perfect views due to the meticulousness of the user, while database frames are not. Under significant changes of the point of view, the local-feature-based BOW model gives poor retrieval performance. To be more robust to the point of view, we represent each video frame by a high-level feature vector derived from a fully connected layer of a CNN. We use a very deep pre-trained network, i.e. VGG-19, and remove the last layer, which is commonly used for the classification task. Video frames are re-sized and normalized before being passed to the feed-forward network. The output of the network is a 4096-dimensional feature vector representing the whole video frame. Comparing two video frames is equivalent to comparing their representing feature vectors. However, using a symmetric metric such as the Euclidean distance (L2) may result in low accuracy, since all components of a feature vector then play the same role. In fact, for each location, some of the components are important. A learning method is proposed to magnify the role of these key components.
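As a small illustration of the shot-level aggregation in Eq. (1) (a sketch assuming the per-frame tf-idf BOW vectors are already computed):

```python
import numpy as np

def pool_shot_bow(frame_bows):
    """Average-pool per-frame BOW vectors into one shot vector, Eq. (1).

    frame_bows -- array of shape (n_frames, vocab_size), tf-idf weighted.
    """
    return np.asarray(frame_bows).mean(axis=0)
```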
3.2 Face feature learning for reranking

The second part of the query is the person examples. Face recognition is a very popular approach to identify a person. Faces are detected using the DPM cascade detector [32], applied to at most 5 key frames per shot. Then face feature vectors are extracted using the VGG-Face descriptor, a CNN based network [15]. In particular, each face image is represented by a 4096-dimensional deep feature vector. After this stage, each person is represented by a set of deep feature vectors $\{F_1, F_2, \ldots, F_m\}$, where m is the number of face examples. We proceed similarly for each frame of a video shot: $S_{F_{i,j,k}}$ represents the feature vector extracted from a face of a person in a video frame. A natural way to compute the similarity between a person and a shot is to take the minimum distance over all pairs of face feature vectors:

$FS_i = \min_{l,j,k} L_2(F^*_l, S^*_{F_{i,j,k}})$

where $F^*_l$ and $S^*_{F_{i,j,k}}$ are the normalized vectors of $F_l$ and $S_{F_{i,j,k}}$, and $L_2$ is the Euclidean distance metric.

Although this feature is designed to work with the L2 distance metric, there is a big performance gap. This can be explained by the fact that the components of a face feature vector should not all carry the same weight; for each face, the weights of the components are different. Therefore, we propose to learn these features with a large margin classifier with a linear kernel. Each candidate face in a frame of a shot, after being passed to the classifier, is scored by a decision value; positive values indicate a positive example, and vice versa.

In this paper, we use a Support Vector Machine (SVM) with linear kernel to train on the face features of the target person. Positive features are chosen from the query examples, while negative ones are taken from the last 50 persons of the initial rank list returned by the L2 distance based approach. After training with the SVM algorithm, the target person is represented by a single model M.

Figure 4: Distribution of z-score normalized distance.
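A minimal sketch of this training step, assuming face descriptors are already extracted and normalized (scikit-learn's LinearSVC stands in for the linear-kernel SVM; the negative mining follows the text):

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_face_model(query_faces, negative_faces):
    """Fit a linear SVM separating the query person's face features
    from features mined from the tail of the initial L2 rank list."""
    X = np.vstack([query_faces, negative_faces])
    y = np.concatenate([np.ones(len(query_faces)),
                        -np.ones(len(negative_faces))])
    return LinearSVC(C=1.0).fit(X, y)

# A shot's face score is then the maximum decision value over the
# normalized face vectors detected in its frames:
#   score_i = model.decision_function(shot_faces).max()
```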
3.3 Final fusion

This is our main contribution module, which leverages the power of the BOW model, deep features and machine learning. First, the rank list returned by BOW based location search is used as the input of the geometric verification step. The visual words of each database video frame are verified using the RANSAC algorithm; the number of inliers represents the similarity between a video frame and the query location. The output of the geometric verification step is the input of the irrelevant location removal step. Using the classifier learned from the location examples, we classify each video frame of a shot with the linear kernel approach. The output score of a shot is the average of the decision values of all frames in that shot. We remove shots with negative decision values and pass the remaining ones to the next step.

In the face based reranking step, we use the face model learned from the query examples to recognize persons in a video shot. The output score of the i-th shot in this step is the maximum decision value over all faces in its frames:

$score_i = \max_{j,k} \mathrm{svm}(M, S^*_{F_{i,j,k}})$

where M is the face model, $S^*_{F_{i,j,k}}$ is the normalized vector of $S_{F_{i,j,k}}$, and svm is the linear classifier. If $score_i > 0$, there is at least one frame containing the query person in the i-th shot, and vice versa.

The final step of our system is scene tracking. To deal with cases where the target person appears in a shot but his face is unclear, we transfer the decision value from the last positive shot to the following ones with a small decrease. Note that we only apply scene tracking to shots which have negative decision values. Assume that two consecutive shots i and i+1 have scores $score_i > 0$ and $score_{i+1} \le 0$; we then update $score_{i+1} = \frac{1}{2} score_i$. We update at most 5 consecutive shots with the same factor. The output of this step is the rank list obtained by sorting the final score values in descending order.

4 Experiment

4.1 Dataset

To demonstrate the advantage of the proposed method on different types of query, we used the TRECVID Instance Search (INS) datasets for evaluation, specifically the TRECVID INS benchmark of 2016 released by NIST. For experimentation, we name this dataset INS2016. For the six years 2010-2015, the instance search task tested systems on retrieving specific instances of objects, persons and locations; these editions share the same collection of test videos with a master shot reference. In 2016, a new query type was tested by asking systems to retrieve specific persons in specific locations. The dataset contains approximately 244 video files extracted from the BBC EastEnders program, with a total of 300 GB in storage and 464 hours in duration. Each query topic of INS2016 consists of two sets of examples: location and person. For the person set, each example includes an image and a corresponding mask that delimits the target entity from others. For the location set, only image examples are provided. This INS dataset is very challenging due to the variety of query types: from indoor to outdoor locations, and from unclear to clear persons.

Evaluation Protocol. There are 30 query topics, or person-location pairs, and about 470 thousand video shots in this challenge. The system must return the top 1000 shots that are most similar to each given topic. The ground truth files for each query are created manually and provided by the TRECVID organizers. To evaluate the performance of each method, we use the mean average precision (MAP) as a standard measurement. Although some evaluations of intermediate results, such as location search when combining deep features and BOW, would be informative, there are already reports on the performance of state-of-the-art systems on the individual queries of previous years' challenges [13]. Therefore, in this paper, we only consider the performance on the compound query.

4.2 Retrieval performance and visualization

In this section, we discuss some quantitative results of our method evaluated against the ground truth from TRECVID INS 2016. For ease of observation, we use the following abbreviations:

– Avg-Fusion: fusion of the normalized scores of person and location.
– L2-Reranking: using our framework, after the geometric verification step, we rerank the initial top K list using the L2 distance for face features. The similarity score of a frame is the opposite of the min-min distance between the face examples and all faces detected in the frames of a shot.
We use the mean of the similarity scores of all frames in a shot to represent the final similarity (average pooling), as in the other methods in this experiment.
– CNN-Loc+L2-Reranking: similar to L2-Reranking, but we add the CNN based location reranking step after the geometric reranking step.
– Linear Kernel: similar to the baseline CNN-Loc+L2-Reranking, but we use a linear kernel to learn the face model of the query person and compute the similarity scores with candidate faces.
– Linear Kernel+scene tracking: similar to Linear Kernel, but we also apply scene tracking to deal with frames in which the face of the target person is not detected.

4.3 Average fusion for person-location query

In many systems, average fusion is one of the simplest and most effective methods to improve retrieval performance. However, for compound queries such as location-person, average fusion is not as good as the face based reranking method, as shown in Table 1. This can be explained by the scores of the target location and person being independent and incomparable. Moreover, frames with very clear and recognizable faces often devote a large portion of the image to the person and thus carry less information about the context scene. Hence, a frame with a higher score for recognizing a face may have a lower score or rank for the location, and vice versa.

Table 1: Comparison between average fusion and reranking methods.

Run            MAP
Avg-Fusion     15.6
L2-Reranking   18.9

4.4 Deep feature for location reranking

In this section, we illustrate that deep features for reranking improve the performance considerably, even for rich-textured query objects such as locations. The experimental results are shown in Table 2. Past state-of-the-art systems of TRECVID showed that, for rich-textured objects such as locations, the local feature based BOW model is one of the most suitable choices. However, in real-life videos, the proportion of location evidence is very small. Using CNN features of the query location, the system has more information to keep scenes that would otherwise be removed by the cut-off threshold in the geometric verification step.

Table 2: Comparison of retrieval systems with and without high-level feature reranking.

Run                     MAP
L2-Reranking            18.9
CNN-Loc+L2-Reranking    19.8

4.5 Face feature learning and scene tracking

Table 3 summarizes the results of our different methods, measuring their relative performance in terms of the MAP score.

Table 3: Experimental results on different configurations for TRECVID INS 2016.

Run                             MAP
Linear Kernel + scene tracking  50.6
Linear Kernel                   25.9
CNN-Loc+L2-Reranking            19.8

From the table, we can see that the first proposed method (Linear Kernel) performs much better than the baseline which only uses the L2 distance (CNN-Loc+L2-Reranking), showing a gain in MAP from 19.8% to 25.9%. Moreover, with the scene tracking step, the final performance is significantly boosted from 25.9% to 50.6%. Also note that the scene tracking step not only keeps the precision high but also improves the recall compared to the Linear Kernel method. There are many cases in which the target person does not turn his face towards the camera, so many shots are lost from the final rank list. By using scene tracking, the total recall of the retrieval system is improved substantially. This can be observed in the precision-recall curves shown in Figure 5, where the curve of Linear Kernel+scene tracking is significantly higher than the other ones.
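For concreteness, the scene tracking rule of Section 3.3 can be sketched as follows (assuming shots are given in temporal order; a minimal sketch, not the authors' code):

```python
def propagate_scene_scores(scores, factor=0.5, max_shots=5):
    """Carry a positive shot score forward to following negative shots.

    Implements score[i+1] = factor * score[i], repeated for at most
    max_shots consecutive shots whose own decision values are <= 0.
    """
    scores = list(scores)
    carried, remaining = 0.0, 0
    for i, s in enumerate(scores):
        if s > 0:            # a genuinely positive shot resets the carry
            carried, remaining = s, max_shots
        elif remaining > 0:  # decay and propagate the carried score
            carried *= factor
            scores[i] = carried
            remaining -= 1
    return scores
```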
To show the efficiency of the proposed method compared to the baseline system, we visualize the rank lists returned by the systems. The query topic is given in Figure 1. The top six shots returned by the system using the L2 distance and the Linear Kernel classifier are visualized in Figure 6; each row shows the key frames of one shot of a rank list. When using the L2 distance, the precision is very low, which is why the top-six rank list of the baseline contains many irrelevant shots, marked by red bounding boxes. Using the Linear Kernel classifier, the precision of the system is improved significantly; hence the ratio of relevant shots is very high.

Figure 5: Precision-recall curves for the experiments on TRECVID INS 2016.

Figure 6: Result visualization of the query from Figure 1. a) Top 6 rank list using the L2 distance. b) Top 6 rank list using the Linear Kernel classifier.

5 Conclusion

Inspired by recent successes of deep learning techniques, in this paper we leverage the power of deep features in the instance search task. We use deep features as a tool for reranking the location search results, bridging the semantic gap left by the BOW model. Moreover, to search for a more difficult object which is deformable and can be captured in different environments, we apply a machine learning approach to learn deep features extracted from human faces detected in video frames. In particular, we investigate a framework combining the BOW model and deep learning based features with application to the instance search task with a new type of query topic: a specific person in a specific location. By conducting experiments on a large-scale dataset, we showed that our proposed method significantly improves the retrieval performance.

In future work, we will investigate advanced deep learning techniques, such as retraining the network with new data generated from query examples. We will also evaluate the retrieval systems on other diverse datasets for more in-depth empirical studies.

Acknowledgement

The video frames from the BBC EastEnders video used in this document are programme material copyrighted by the BBC. This research is funded by Vietnam National University HoChiMinh City (VNU-HCM) under grant number B2017-26-01.

References

[1] R. Arandjelović and A. Zisserman. Three things everyone should know to improve object retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), CVPR '12, pages 2911–2918, Washington, DC, USA, 2012.
[2] R. Arandjelović and A. Zisserman. All about VLAD. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1578–1585, 2013.
[3] A. Babenko and V. S. Lempitsky. Aggregating deep convolutional features for image retrieval. CoRR, abs/1510.07493, 2015.
[4] A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky. Neural codes for image retrieval. In D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, editors, Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I, pages 584–599. Springer International Publishing, Cham, 2014.
[5] Y. Cao, C. Wang, Z. Li, L. Zhang, and L. Zhang. Spatial-bag-of-features. In IEEE Conference on Computer Vision and Pattern Recognition, pages 3352–3359, June 2010.
[6] D. Chen, X. Cao, L. Wang, F. Wen, and J. Sun. Bayesian face revisited: A joint formulation.
In Proceedings of the European Conference on Computer Vision - Volume Part III, ECCV'12, pages 566–579, Berlin, Heidelberg, 2012. Springer-Verlag.
[7] O. Chum, M. Perdoch, A. Mikulik, and J. Matas. Total recall II: Query expansion revisited. In IEEE Conference on Computer Vision and Pattern Recognition, pages 889–896, Los Alamitos, CA, USA, 2011.
[8] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman. Total recall: Automatic query expansion with a generative feature model for object retrieval. In IEEE International Conference on Computer Vision, 2007.
[9] R. G. Cinbis, J. Verbeek, and C. Schmid. Unsupervised Metric Learning for Face Identification in TV Video. In ICCV 2011 - International Conference on Computer Vision, pages 1559–1566, Barcelona, Spain, Nov. 2011. IEEE.
[10] H. Jegou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In Proceedings of the European Conference on Computer Vision: Part I, ECCV '08, pages 304–317, Berlin, Heidelberg, 2008. Springer-Verlag.
[11] H. Jégou and A. Zisserman. Triangulation embedding and democratic aggregation for image search. In CVPR - International Conference on Computer Vision and Pattern Recognition, Columbus, United States, June 2014.
[12] C. Lu and X. Tang. Surpassing human-level face verification performance on LFW with GaussianFace. In Proceedings of the AAAI Conference on Artificial Intelligence, AAAI'15, pages 3811–3819. AAAI Press, 2015.
[13] P. Over, J. Fiscus, G. Sanders, D. Joy, M. Michel, G. Awad, A. Smeaton, W. Kraaij, and G. Quénot. TRECVID 2014 – an overview of the goals, tasks, data, evaluation mechanisms and metrics. In Proceedings of TRECVID 2014. NIST, USA, 2014.
[14] O. M. Parkhi, K. Simonyan, A. Vedaldi, and A. Zisserman. A compact and discriminative face track descriptor. In IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2014.
[15] O. M. Parkhi, A. Vedaldi, and A. Zisserman. Deep face recognition. In British Machine Vision Conference, 2015.
[16] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007.
[17] J. Philbin, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.
[18] A. S. Razavian, H. Azizpour, J. Sullivan, and S. Carlsson. CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW '14, pages 512–519, Washington, DC, USA, 2014. IEEE Computer Society.
[19] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Neural Information Processing Systems (NIPS), 2015.
[20] F. Schroff, D. Kalenichenko, and J. Philbin. FaceNet: A unified embedding for face recognition and clustering. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[21] X. Shen, Z. Lin, J. Brandt, S. Avidan, and Y. Wu. Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 3013–3020, June 2012.
[22] K. Simonyan, O. M. Parkhi, A. Vedaldi, and A. Zisserman. Fisher Vector Faces in the Wild. In British Machine Vision Conference, 2013.
[23] J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In Proceedings of the International Conference on Computer Vision, volume 2, pages 1470–1477, Oct. 2003.
[24] Y. Sun, Y. Chen, X. Wang, and X. Tang. Deep learning face representation by joint identification-verification. In Proceedings of the International Conference on Neural Information Processing Systems, NIPS'14, pages 1988–1996, Cambridge, MA, USA, 2014. MIT Press.
[25] Y. Sun, D. Liang, X. Wang, and X. Tang. DeepID3: Face recognition with very deep neural networks. CoRR, abs/1502.00873, 2015.
[26] Y. Sun, X. Wang, and X. Tang. Deep learning face representation from predicting 10,000 classes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR '14, pages 1891–1898, Washington, DC, USA, 2014. IEEE Computer Society.
[27] Y. Sun, X. Wang, and X. Tang. Deeply learned face representations are sparse, selective, and robust. CoRR, abs/1412.1265, 2014.
[28] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the gap to human-level performance in face verification. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014.
[29] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. Web-scale training for face identification. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[30] G. Tolias and Y. S. Avrithis. Speeded-up, relaxed spatial matching. In IEEE International Conference on Computer Vision, ICCV 2011, Barcelona, Spain, November 6-13, 2011, pages 1653–1660, 2011.
[31] J. Wan, D. Wang, S. C. H. Hoi, P. Wu, J. Zhu, Y. Zhang, and J. Li. Deep learning for content-based image retrieval: A comprehensive study. In Proceedings of the ACM International Conference on Multimedia, MM '14, pages 157–166, New York, NY, USA, 2014. ACM.
[32] L. Wolf, T. Hassner, and I. Maoz. Face recognition in unconstrained videos with matched background similarity. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[33] W. Zhang and C.-W. Ngo. Searching visual instances with topology checking and context modeling. In Proceedings of the ACM Conference on International Conference on Multimedia Retrieval, ICMR '13, pages 57–64, New York, NY, USA, 2013. ACM.
[34] C. Zhu, H. Jegou, and S. Satoh. Query-adaptive asymmetrical dissimilarities for visual object retrieval. In IEEE International Conference on Computer Vision, ICCV 2013, Sydney, Australia, December 1-8, 2013, pages 1705–1712. IEEE, 2013.

Another Look at Radial Visualization for Class-preserving Multivariate Data Visualization

Van Long Tran
University of Transport and Communications, Hanoi, Vietnam
E-mail: vtran@utc.edu.vn

Keywords: data visualization, radial visualization, quality visualization

Received: March 24, 2017

Multivariate data visualization is an interesting research field with many applications in various fields of science. Radial visualization is one of the most common information visualization concepts for visualizing multivariate data. However, radial visualization may display different information about the structures of multivariate data.
For example, all points which are multiples of a given point map to the same point in the visual space. An optimal layout of a radial visualization is usually found by defining a suitable order of the data dimensions on the unit circle. In this paper, we propose a novel method that improves the radial visualization layout for cluster preservation of multivariate data. Traditional radial visualizations have their viewpoint at the origin. The idea of our proposed method is to find the most suitable viewpoint among the corners of a hypercube from which to look into the cluster structures of data sets. Our method provides an improvement in visualizing the class structures of multivariate data sets in the radial visualization. We present our method with three kinds of quality measurements and demonstrate the effectiveness of our method on several data sets.

Povzetek: Predstavljena je vizualizacija multivariantnih podatkov.

1 Introduction

Many scientific and business applications produce large data sets with increasing complexity and dimensionality. While information is growing in an exponential way, data are ubiquitous in our world. Data should contain some kind of valuable information that can possibly be explored using human knowledge. However, extracting meaningful information from large-scale data is a difficult task. Information visualization techniques have been proven to be of high value in gaining insight into these large data sets.

The aim of information visualization is to use computer-based interactive visual representations of abstract and non-physically based data to amplify human cognition. It aims at helping users to effectively detect and explore the expected, as well as discover the unexpected, to gain insight into the data [6].

A major challenge for information visualization is how to present multidimensional data to analysts, because complex visual structures occur. Data visualization methods often employ a map from multidimensional data into a lower-dimensional visual space. The reason is that the visual space is composed of two or three spatial coordinates and a limited number of visual factors such as color, texture, etc. However, when the dimensionality of the data is high, usually from tens to hundreds, the mapping from the multidimensional data space into the visual space imposes information loss. This leads to one of the big questions in information visualization [6]: how to project from a multidimensional data space into a low-dimensional space while best preserving the characteristics of the data.

The order of data dimensions is a crucial problem for the effectiveness of many multidimensional data visualization techniques [3] such as parallel coordinates [13], star coordinates [14], radial visualization (Radviz) [10], scatterplot matrices [2], circle segments [4], and pixel recursive patterns [15]. The data dimensions have to be positioned in some one- or two-dimensional arrangement on the screen. The chosen arrangement of data dimensions can have a major impact on the expressiveness of the visualization, because relationships among adjacent dimensions are easier to detect than relationships among dimensions positioned far from each other. Dimension ordering aims to improve the effectiveness of the visualization by giving reasonable orders to the dimensions so that users can easily detect relationships or pay more attention to more important dimensions.
The Radviz technique is one of the most common visualization techniques used in medical analysis [10, 11, 16]. Finding the optimal order of data dimensions in Radviz is known to be NP-complete [3]. Although a number of methods have been proposed for solving the dimension ordering problem in Radviz [16, 8], most of them are exhaustive or greedy local searches in the space of all permutations of the data dimensions. These methods are usually only tested on data sets with a small number of dimensions.

One of the disadvantages of Radviz is that all multidimensional points which differ by a multiplicative constant, i.e., all points cp for a fixed point p and various non-zero scalars c, map to the same position in the visual space. Thus, all these points are separate in the original space but cannot be differentiated in the visual space. This property is invariant under all permutations. Radviz can be explained as a combination of a perspective projection and a linear mapping, with the viewpoint at the origin and the view plane being a simplex. In this paper, we propose another variant of Radviz that supports users in visualizing the data inside a hypercube from an arbitrary viewpoint at the corners of the hypercube. Finding a suitable viewpoint of the hypercube in an n-dimensional space has $2^n$ possible cases. In general, finding a good viewpoint is less complicated than finding a good permutation of the data dimensions in Radviz.

The remaining part of this paper is organized as follows. In Section 2, we present related work on Radviz and data dimension reordering in multivariate data visualization techniques. The inversion axes in Radviz are presented in Section 3. In Section 4, we describe methods for measuring the quality of class visualizations of multivariate data in the visual space. In Section 5, we show the effectiveness of our methods on five well-known multivariate data sets in the case of classified data. In Section 6, we compare the results on the five data sets with dimension permutations in Radviz computed by other algorithms. In Section 7, we present our conclusion and future work.

2 Related work

Principal Component Analysis (PCA) is one of the most common methods for the analysis of multivariate data [12]. Applied to visualizing multivariate data, PCA is a linear projection onto two or three eigenvectors. The general linear mapping can be defined as $P(x) = Vx$, where V is a matrix. PCA projects a multidimensional point x into the space spanned by the two or three eigenvectors corresponding to the two or three largest eigenvalues of the covariance matrix of the given data set.

Star coordinates were introduced by Kandogan [14]. Star coordinates use a linear mapping whose transformation matrix has the i-th column $V_i = (\cos\frac{2\pi i}{n}, \sin\frac{2\pi i}{n})^T$. The vectors $\{V_i, i = 1, 2, \ldots, n\}$ are placed evenly on the unit circle in the two-dimensional visual space. The author also introduced several techniques for interaction with star coordinates, for example moving the axes $V_i$ in the visual space. In [5], 3D star coordinates are introduced with $V_i = (\cos\frac{2\pi i}{n}, \sin\frac{2\pi i}{n}, 1)^T$, extending the 2D star coordinates by adding a third coordinate that is the sum of all data coordinates. Further properties can be found in [20, 17]. Long and Linsen [22] propose optimal 3D star coordinates for visualizing hierarchical clusters in multidimensional data. Radviz was proposed by Hoffman et al. [10].
Radviz can be explained as a perspective projection of the 3D star coordinates with the viewpoint at the origin and the viewing plane z = 1. A normalized Radviz and the properties of Radviz are presented in [7]. An important problem with Radviz is the ordering of the dimensional anchors for a good view of the multivariate data. In [19], the t-statistic method for reordering the dimensional anchors on the unit circle is introduced; the t-statistic is applied to labelled data. Di Caro et al. [8] proposed two methods for dimension arrangement in Radviz, formulated as an optimization problem over a similarity matrix between data dimensions and a neighbourhood matrix between the data dimensions on the unit circle. Albuquerque et al. [1] used the Cluster Density Measure (CDM) for finding a good layout of Radviz; the authors propose a greedy incremental algorithm that successively adds data dimensions to the Radviz layout to determine a suitable order.

3 Radial visualization method

3.1 Radviz

Radviz was first introduced by Hoffman et al. in [10, 11], and it can be regarded as an effective non-linear dimensionality reduction method. Radviz directly maps multidimensional data points into the visual space based on the equilibrium of a spring system. In Radviz, springs are attached to the dimensional anchors. The stiffness of each spring equals the value of the dimension corresponding to its dimensional anchor. The other end of each spring is attached to a point in the visual space, whose location ensures the equilibrium of the spring system. Let $x = (x_1, x_2, \ldots, x_n)$ be a data point in the hypercube $[0, 1]^n$. The dimensional anchors $S_i$, $i = 1, 2, \ldots, n$, are calculated by the formula:

$S_i = \left(\cos\frac{2\pi(i-1)}{n}, \sin\frac{2\pi(i-1)}{n}\right), \quad i = 1, 2, \ldots, n.$

For the spring system to be in equilibrium, we must have $\sum_{i=1}^{n} x_i (p - S_i) = 0$, which gives the location of p as follows:

$p = \frac{\sum_{i=1}^{n} x_i S_i}{\sum_{i=1}^{n} x_i}. \qquad (1)$

Thus, the multidimensional point x is represented by the point p. Figure 1 shows how a sample x of an eight-dimensional space is represented by a point p in a 2-dimensional plot. (A minimal code sketch of this mapping is given after the property list below.)

Figure 1: Radviz visualizes a point in 8 dimensions. The dimensions are represented by points placed equally spaced on the unit circle. An observation x is displayed at position p corresponding to its attributes x1, x2, . . . , x8.

The important properties of the Radviz method are described in [7]:

– If all coordinates of a multidimensional point have the same value, the data point lies exactly at the origin of the plot. Points with approximately equal dimensional values (after normalization) lie close to the center. Points with similar dimensional values whose dimension anchors are opposite each other on the circle also lie near the center.
– If the point is a unit vector, it lies exactly at the fixed point on the edge of the circle where the spring for that dimension is attached. Points which have one or two coordinate values significantly greater than the others lie closer to the dimensional anchors (fixed points) of those dimensions.
– The position of a point depends on the layout of the dimensional anchors around the circle.
– Many points can be mapped to the same position. The mapping represents a non-linear transformation of the data that preserves certain symmetries.
– The Radviz method maps each data record of a multidimensional data set to a point within the convex hull of the dimensional anchors.
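A minimal sketch of the Radviz mapping of Eq. (1), assuming the data are already normalized to [0, 1] and every row has a positive coordinate sum:

```python
import numpy as np

def radviz(X):
    """Map rows of X (n_samples x n_dims, values in [0, 1]) to 2D
    via Eq. (1): p = sum_i x_i S_i / sum_i x_i."""
    n = X.shape[1]
    ang = 2 * np.pi * np.arange(n) / n
    S = np.column_stack([np.cos(ang), np.sin(ang)])   # anchors S_i
    return (X @ S) / X.sum(axis=1, keepdims=True)
```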
We can consider the Radviz nonlinear mapping as the composition of a perspective projection with the viewer at $o = (0, 0, \ldots, 0)$ onto the simplex $\sum_{i=1}^{n} x_i = 1$,

$$V(x) = \left( \sum_{i=1}^{n} x_i \right)^{-1} x,$$

and a linear mapping as in the Star coordinates [14],

$$L_S(x) = \sum_{i=1}^{n} x_i S_i.$$

The Radviz mapping can thus be rewritten as follows:

$$R(x) = L_S(V(x)) = \left( \sum_{i=1}^{n} x_i \right)^{-1} \sum_{i=1}^{n} x_i S_i.$$

3.2 Inversion Radviz

We propose a method that supports users in viewing the unit hypercube from an arbitrary corner. We assume that the viewpoint is a point $p = (p_1, p_2, \ldots, p_n) \in \{0, 1\}^n$. The simplex at the point $p$ is the hyperplane $(\pi_p)$ that goes through the $n$ points $(p_1, \ldots, 1 - p_i, \ldots, p_n)$, $i = 1, 2, \ldots, n$. The equation of this simplex is

$$\sum_{i=1}^{n} (1 - 2p_i) x_i = 1 - \sum_{i=1}^{n} p_i,$$

which we can rewrite as

$$(\pi_p): \sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i) = 1.$$

We now find the position of the multidimensional point $x = (x_1, x_2, \ldots, x_n) \in [0, 1]^n$ in the visual space. The coordinates of the point $x$ with respect to the origin $p$ and the basis vectors

$$\left( (1 - 2p_1) e_1,\ (1 - 2p_2) e_2,\ \ldots,\ (1 - 2p_n) e_n \right)$$

are denoted by

$$x_p = \left( \frac{x_1 - p_1}{1 - 2p_1},\ \frac{x_2 - p_2}{1 - 2p_2},\ \ldots,\ \frac{x_n - p_n}{1 - 2p_n} \right) = \left( p_1 + (1 - 2p_1) x_1,\ \ldots,\ p_n + (1 - 2p_n) x_n \right),$$

where $(e_1, e_2, \ldots, e_n)$ are the standard basis vectors of $\mathbb{R}^n$ (the two expressions coincide because $1 - 2p_i \in \{-1, 1\}$). Obviously, the coordinates of the point $x$ are the coordinates of the vector $x - p$ with respect to the basis above. The perspective projection $V$ maps the point $x_p$ onto the hyperplane $(\pi_p)$ at the point $V_p(x)$, where

$$V_p(x) = \frac{\left( p_1 + (1 - 2p_1) x_1,\ \ldots,\ p_n + (1 - 2p_n) x_n \right)}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)}.$$

Figure 2 displays the viewpoint $p$, the view plane $(\pi_p)$, and the location $V_p(x)$ of the multidimensional point $x$ on the hyperplane $(\pi_p)$.

Figure 2: The perspective projection at corner $p$.

The Radviz projection at the point $p$ is defined as

$$P(x) = \frac{\sum_{i=1}^{n} \left( p_i + (1 - 2p_i) x_i \right) S_i}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)},$$

or, equivalently,

$$P(x) = \frac{\sum_{p_i = 0} x_i S_i + \sum_{p_i = 1} (1 - x_i) S_i}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)}.$$

The $i$th coordinate of the point $x$ corresponding to $p_i = 1$ is thus changed to $1 - x_i$. We therefore propose an inversion Radviz (iRadviz for short) that projects the multidimensional point $x$ onto the visual space as follows:

$$R_{p,S}(x) = \frac{\sum_{p_i = 0} x_i S_i + \sum_{p_i = 1} (1 - x_i) S_i}{\sum_{p_i = 0} x_i + \sum_{p_i = 1} (1 - x_i)}. \quad (2)$$

Figure 3 shows Radviz and iRadviz visualizing a synthetic data set in three-dimensional space, called the 3D data set. The 3D data set contains 700 points split into seven clusters; each cluster has 100 points and is located at one of the seven vertices of the cube other than the vertex $(1, 1, 1)$. Figure 3 (left) shows the traditional Radviz visualizing the 3D data set: the cluster at the origin $(0, 0, 0)$ is spread over the simplex, and for a three-dimensional data set the Radviz picture is not improved by any permutation of the dimensional anchors (a permutation of three anchors only rotates or reflects the plot). Figure 3 (right) shows the 3D data set with iRadviz using viewpoint $(1, 1, 1)$, where the seven clusters are perfectly separated.

Figure 3: The synthetic 3D data visualization. (Left) Traditional Radviz. (Right) iRadviz with viewpoint $(1, 1, 1)$.

For interaction, users can select a dimensional anchor $p_i$ in Radviz and flip this coordinate of the viewpoint to $1 - p_i$. For finding the optimal viewpoint of the iRadviz of a given data set, we need a quality measurement that defines a suitable view of a multidimensional data set.
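Given `radviz` above, the inversion (2) reduces to flipping the coordinates selected by the viewpoint before projecting. The following sketch (ours, under the same assumptions as before) makes this explicit:

```python
import numpy as np

def iradviz(X, p):
    """Inversion Radviz, equation (2): view the hypercube from corner
    p in {0,1}^n by replacing x_i with 1 - x_i wherever p_i = 1."""
    p = np.asarray(p)
    Xp = np.where(p == 1, 1.0 - X, X)   # coordinates x_p relative to corner p
    return radviz(Xp)                   # then apply the ordinary Radviz mapping

# The 3D example of Figure 3: iradviz(X, (1, 1, 1)) separates the seven
# corner clusters that overlap around the center in radviz(X).
```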
4 Quality measurement

Suppose the data set $X = \{x_i : 1 \le i \le n\}$ is classified into $K$ classes, labeled by $C = \{1, 2, \ldots, K\}$, and denote by $n_k$ the number of data points in the $k$th class. In this section, we briefly present three methods to measure the quality of iRadviz for visualizing supervised data. Without loss of generality, we also denote the data set projected into the visual space by $X = \{x_i : 1 \le i \le n\} \subset \mathbb{R}^2$.

4.1 Class distance consistency

For each class, we denote by $c_k$ the centroid of the $k$th class. A data point $x$ belongs to a particular class if the distance from $x$ to the centroid of this class is smallest. Hence, we define

$$class(x) = \arg\min_{1 \le k \le K} \|x - c_k\|.$$

A data point $x$ is correctly represented if its label is the same as its class; otherwise it is a miss. The Class Distance Consistency (CDC) [21] of the data set $X = \{x_i : 1 \le i \le n\}$ is defined as the fraction of correctly represented data points, i.e.,

$$Q(CDC, X) = \frac{|\{x_i : label(x_i) = class(x_i)\}|}{n}. \quad (3)$$

The CDC quality measurement for class visualization is applicable to clusters of spherical shape.

4.2 Cluster density measurement

The quality Cluster Density Measurement (CDM) [1] is defined as follows:

$$Q(CDM, X) = \sum_{i,j=1}^{K} \frac{d_{ij}^2}{r_i r_j}, \quad (4)$$

where $d_{ij} = \|c_i - c_j\|$ is the Euclidean distance between two cluster centroids, and $r_i$ is the average radius of the $i$th cluster, i.e.,

$$r_i = \frac{\sum_{label(x) = i} \|x - c_i\|}{n_i}.$$

A high quality value indicates well-defined cluster separation, with small intra-cluster distances and large inter-cluster distances. Hence, the higher the quality measure is, the better the visualization of the supervised data set.

4.3 Conditional entropy

The Havrda-Charvát structural $\alpha$-entropy [9] is defined as

$$H_\alpha(X) = \frac{2^{\alpha-1}}{2^{\alpha-1} - 1} \left( 1 - \sum_{i=1}^{n} p^\alpha(x_i) \right), \quad \alpha > 0,\ \alpha \ne 1.$$

A conditional Havrda-Charvát structural $\alpha$-entropy [18] for class visualization quality is defined as follows:

$$H_\alpha(C|X) = \int p(x)\, H_\alpha(C|X = x)\, dx = \frac{2^{\alpha-1}}{2^{\alpha-1} - 1} \left( 1 - \sum_{j=1}^{K} \int p^\alpha(j|x)\, p(x)\, dx \right).$$

We can estimate the conditional entropy $H_\alpha(C|X)$ as follows:

$$H_\alpha(C|X) = \frac{2^{\alpha-1}}{2^{\alpha-1} - 1} \left( 1 - \frac{1}{n} \sum_{j=1}^{K} \sum_{i=1}^{n} p^\alpha(j|x_i) \right).$$

If each data point $x_i$ is classified into only one class, i.e., $p(j|x_i) = 1$ for the $j$th class and $p(j|x_i) = 0$ for any other class, the conditional entropy achieves its minimal value. When $\alpha = 2$, we have the quadratic entropy:

$$H_2(C|X) = 2 \left( 1 - \frac{1}{n} \sum_{j=1}^{K} \sum_{i=1}^{n} p^2(j|x_i) \right).$$

By Bayes' theorem, we have

$$p(j|x) = \frac{p(j)\, p(x|j)}{p(x)}.$$

The prior probability is estimated by $p(j) = n_j / n$. The densities $p(x|j)$ and $p(x)$ are estimated by a nonparametric technique, the Parzen window method. Consider a small region $R(x)$ that contains $x$ and has area $V$. Assume the region $R(x)$ contains $k_j(x)$ points of the $j$th class and $k(x)$ points of the data set. We estimate the densities by

$$p(x|j) = \frac{k_j(x)}{n_j V}, \qquad p(x) = \frac{k(x)}{n V}.$$

Therefore, the conditional probability $p(j|x)$ can be estimated by

$$p(j|x) = \frac{n_j}{n} \cdot \frac{k_j(x)}{n_j V} \cdot \frac{n V}{k(x)} = \frac{k_j(x)}{k(x)}.$$

The entropy quality is defined as follows:

$$Q(ENT, X) = 1 - \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{K} \left( \frac{k_j(x_i)}{k(x_i)} \right)^2. \quad (5)$$

The lower the quality entropy is, the better the clustering visualization. For calculating the entropy quality, we divide the square region that contains the whole data set into $N \times N$ grid cells. The grid size $N$ in two-dimensional space is estimated via the $k$-nearest neighbors: each cell $c$ has 9 neighbor cells (including itself), and on average these 9 cells contain $9n / N^2$ points. The grid size $N$ is obtained by requiring $9n / N^2 = \sqrt{n}$, i.e.,

$$N = 1 + \left[ 3 \sqrt[4]{n} \right].$$

For each cell $c$, we store the class point counts $c = (c_1, c_2, \ldots, c_K)$, where $c_j$ is the number of points of the $j$th class falling into the cell $c$. For each point $x$ that falls in the cell $c$, the region $R(x)$ contains all cells that are neighbors of the cell $c$. We have

$$k_j(x) = \sum_{c' \in R(x)} c'_j \qquad \text{and} \qquad k(x) = \sum_{j=1}^{K} k_j(x).$$

The complexity of computing the entropy quality is therefore $O(Kn)$, i.e., it has linear time complexity.
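All three measures are short to implement. The sketch below is ours (numpy assumed; labels are assumed to be integers $0, \ldots, K-1$; `iradviz` as in Section 3.2) and includes the exhaustive search over the $2^n$ corners used to select a viewpoint. Note that CDC and CDM are to be maximized, while the entropy quality (5) is to be minimized:

```python
import numpy as np
from itertools import product

def cdc(P, labels):
    """Class Distance Consistency, equation (3)."""
    classes = np.unique(labels)
    cents = np.array([P[labels == k].mean(axis=0) for k in classes])
    d = np.linalg.norm(P[:, None, :] - cents[None, :, :], axis=2)
    return np.mean(classes[d.argmin(axis=1)] == labels)

def cdm(P, labels):
    """Cluster Density Measure, equation (4)."""
    classes = np.unique(labels)
    cents = np.array([P[labels == k].mean(axis=0) for k in classes])
    radii = np.array([np.linalg.norm(P[labels == k] - c, axis=1).mean()
                      for k, c in zip(classes, cents)])
    d2 = ((cents[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
    return (d2 / np.outer(radii, radii)).sum()   # d_ii = 0, so i = j adds nothing

def ent(P, labels):
    """Entropy quality, equation (5), estimated on an N x N grid with
    3 x 3 cell neighborhoods and N = 1 + [3 * n^(1/4)]."""
    n, K = len(P), int(labels.max()) + 1
    N = 1 + int(3 * n ** 0.25)
    lo, hi = P.min(axis=0), P.max(axis=0)
    cells = np.minimum(((P - lo) / (hi - lo + 1e-12) * N).astype(int), N - 1)
    counts = np.zeros((N + 2, N + 2, K))        # padded so border cells have 9 neighbors
    for (a, b), lab in zip(cells, labels):
        counts[a + 1, b + 1, lab] += 1          # class point counts per cell
    q = 0.0
    for a, b in cells:
        kj = counts[a:a + 3, b:b + 3].sum(axis=(0, 1))  # k_j(x) over the 9 cells
        q += ((kj / kj.sum()) ** 2).sum()               # sum_j (k_j(x)/k(x))^2
    return 1.0 - q / n

def best_viewpoint(X, labels, quality=cdc, maximize=True):
    """Brute-force search over the 2^n corners of the unit hypercube."""
    sign = 1.0 if maximize else -1.0
    return max(product((0, 1), repeat=X.shape[1]),
               key=lambda p: sign * quality(iradviz(X, np.asarray(p)), labels))
```

Searching all $2^n$ corners is feasible for the data sets used here ($n \le 13$); for the entropy measure one would call `best_viewpoint(X, labels, quality=ent, maximize=False)`.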
5 Experimental results

We tested our approach on five data sets. For each data set, we find the viewpoint for iRadviz based on each of the three quality measurements presented in Section 4.

The first, well-known data set is the Iris data set (http://archive.ics.uci.edu/ml/datasets/Iris). The Iris data set contains 150 data points, four attributes — X1 (sepal length), X2 (sepal width), X3 (petal length), X4 (petal width) — and three classes: Setosa (50 data points), Versicolour (50 data points), and Virginica (50 data points). Figure 4 shows the iRadviz approach for visualizing the Iris data set; classes are encoded by different colors. One class (red) is perfectly separated from the other two. Figure 4 (left), with inversion of the axes X2, X3, X4, and Figure 4 (right), with inversion of the axes X1, X2, X3, X4, show the three classes better separated than Figure 4 (middle), without inversion of the axes.

Figure 4: The Iris data. (Left) The best iRadviz visualization based on CDC quality. (Middle) The best iRadviz visualization based on CDM quality. (Right) The best iRadviz visualization based on Entropy quality.

The second data set is the Wine data set (http://archive.ics.uci.edu/ml/datasets/Wine). The Wine data set includes 178 data points with 13 attributes: X1 (Alcohol), X2 (Malic acid), X3 (Ash), X4 (Alcalinity of ash), X5 (Magnesium), X6 (Total phenols), X7 (Flavanoids), X8 (Nonflavanoid phenols), X9 (Proanthocyanins), X10 (Color intensity), X11 (Hue), X12 (OD280/OD315 of diluted wines), and X13 (Proline). The Wine data set is classified into three classes: class 1 (59 data points), class 2 (71 data points), and class 3 (48 data points). Figure 5 shows the Wine data set from different viewpoints using iRadviz; the different colors represent the different classes. Figure 5 (left) shows the best iRadviz visualization for the Wine data set with the highest CDC quality, where inversion was applied to axes X4, X5, X7, X10. Figure 5 (middle) shows the best iRadviz visualization with the highest CDM quality, where inversion was applied to axes X1, X2, X3, X4, X8, X9, X11, X12, X13. Figure 5 (right) shows the best iRadviz visualization with the highest Entropy quality, where inversion was applied to axes X6, X7, X10.

Figure 5: The Wine data. (Left) The best CDC quality of the iRadviz visualization. (Middle) The best CDM quality of the iRadviz visualization. (Right) The best Entropy quality of the iRadviz visualization.

The third data set, Y14c, is a synthetic data set that contains 480 data points with ten attributes, partitioned into 14 clusters. Figure 6 shows three views of the Y14c data with different viewpoints in iRadviz; the inverted axes are highlighted in red. Figure 6 (left) shows the best iRadviz class visualization of this data under the CDC quality, with inversion of axes 2, 3, 4, 5, 6, 7. The clusters shown in this figure are well separated.

Figure 6: The Y14c data. (Left) The best quality CDC on iRadviz. (Middle) The best quality CDM on iRadviz. (Right) The best quality Entropy on iRadviz.
Figure 6 (middle) shows the best iRadviz based on the highest CDM quality, with inversion of axes 1, 2, 3, 6, 10; several clusters overlap in this visualization. Figure 6 (right) shows the best iRadviz based on the highest Entropy quality, with inversion of axes 1, 2, 3, 4, 6, 9; in this figure the clusters are perfectly separated. The Y14c data set contains two clusters that differ only by a scale factor. Since the Radviz mapping (1) is invariant under scaling of a data point, these clusters fully overlap in Radviz under any permutation of the dimensional anchors.

The fourth data set is the Italian Olive Oils data (Olive for short; http://cran.r-project.org/). The Olive data set consists of 572 data samples with eight attributes describing eight fatty acids (X1 palmitic, X2 palmitoleic, X3 stearic, X4 oleic, X5 linoleic, X6 linolenic, X7 arachidic, X8 eicosenoic). The Olive data set is classified into nine clusters, each corresponding to one of nine areas of Italy. Figure 7 shows the iRadviz class visualization of the Olive data set with the best quality based on CDC (left), CDM (middle), and Entropy (right). In Figure 7 (left and right) the classes are more separated than in Figure 7 (middle).

Figure 7: The Olive Oil data. (Left) The best quality CDC on iRadviz. (Middle) The best quality CDM on iRadviz. (Right) The best quality Entropy on iRadviz.

The last data set is called Ecoli (https://archive.ics.uci.edu/ml/datasets/Ecoli). The Ecoli data set contains 336 data samples, each consisting of seven attributes. It is partitioned into eight clusters with 143, 77, 52, 35, 20, 5, 2, and 2 data samples respectively; the last three clusters contain very few samples. Figure 8 shows the class visualization using iRadviz with the best quality based on CDC (left), CDM (middle), and Entropy (right).

Figure 8: The Ecoli data set. (Left) The best quality CDC on iRadviz. (Middle) The best quality CDM on iRadviz. (Right) The best quality Entropy on iRadviz.

6 Comparison and discussion

In this section, we present quality measurements of our proposed method versus permutation, and of our method versus other algorithms.

6.1 Inversion dimension versus permutation

For the first three data sets (Iris, Ecoli, and Olive), we find the globally best permutation for each quality measurement by searching over all permutations. For the two remaining data sets (Y14c and Wine), we find the locally best permutation. We call two permutations of the data dimensions neighbors if they differ by a swap of two consecutive positions; the locally best permutation achieves the best quality over all of its neighbor permutations.

Class Distance Consistency: Table 1 shows that the quality of our approach is better than the CDC quality of [21] for the Iris, Ecoli, Y14c, and Wine data sets and is slightly lower than the CDC quality for the Olive data set.

Table 1: The best CDC quality over permutation and over inversion axes.

               Iris      Ecoli     Olive     Y14c      Wine
  Permutation  84.67%    67.56%    82.34%    93.96%    94.94%
  iRadviz      94.00%    78.57%    80.24%    100%      96.63%

Cluster Density Measurement: Table 2 shows that the CDM quality of our approach is better than the CDM quality of [2] for the last two data sets, lower for the Ecoli and Olive data sets, and the same for the Iris data set.

Table 2: The best CDM quality over permutation and over inversion axes.

               Iris      Ecoli     Olive     Y14c      Wine
  Permutation  44.242    42.457    27.825    358.37    13.914
  iRadviz      44.242    32.325    23.078    459.824   16.634

Entropy Measurement: Table 3 shows that the Entropy quality of our approach is better than the Entropy quality of [18] for the Iris, Ecoli, and Y14c data sets, and slightly lower for the Olive and Wine data sets.

Table 3: The best Entropy quality over permutation and over inversion axes.

               Iris      Ecoli     Olive     Y14c      Wine
  Permutation  0.1316    0.2057    0.1198    0.0648    0.0084
  iRadviz      0.0028    0.1645    0.1281    0.000     0.0261

6.2 Inversion axes versus other permutation algorithms

In this section, we compare the quality measurements of our method with those of the t-statistic method and the CDM method for permutation on Radviz [1]. The best permutation in Radviz for the Wine data by the t-statistic method is {1, 2, 4, 8, 10, 11, 13, 12, 9, 7, 6, 5, 3}, and the CDM method delivers {8, 3, 4, 2, 10, 13, 1, 5, 6, 7, 9, 12, 11}. The best permutation in Radviz for the Olive data by the t-statistic method is {1, 2, 5, 4, 8, 7, 3, 6}, and the CDM method delivers {1, 3, 4, 7, 6, 2, 8}.
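The neighbor-swap local search used for the Y14c and Wine baselines in Section 6.1 can be sketched as follows (our sketch; `radviz` and `cdc` as above, and the greedy first-improvement strategy is our assumption, since the paper does not specify the search order):

```python
def local_best_permutation(X, labels, quality=cdc):
    """Hill climbing over neighbor permutations, i.e., permutations that
    differ by swapping two consecutive dimensional anchors."""
    perm = list(range(X.shape[1]))
    best = quality(radviz(X[:, perm]), labels)
    improved = True
    while improved:
        improved = False
        for i in range(len(perm) - 1):
            cand = perm[:i] + [perm[i + 1], perm[i]] + perm[i + 2:]
            q = quality(radviz(X[:, cand]), labels)
            if q > best:                 # first improvement: restart from cand
                perm, best, improved = cand, q, True
    return perm
```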
Table 4 shows the CDC and Entropy (ENT) quality measurements for the Olive and Wine data sets, for the t-statistic method, the CDM method, and our method. The overall quality measurements of our approach are better than those of the t-statistic and CDM methods, except for the Entropy quality measure applied to the Wine data set.

Table 4: The quality measurements for the Olive and Wine data sets.

                   Olive               Wine
  Method           CDC      ENT       CDC      ENT
  t-statistic      55.95%   0.4090    75.28%   0.1643
  CDM              76.57%   0.1826    88.87%   0.0176
  Our method       80.02%   0.1281    96.63%   0.0261

Figure 9 (left) shows Radviz visualizing the Wine data set with the best permutation by the t-statistic method, and Figure 9 (right) shows Radviz visualizing the Wine data set with the best permutation by the CDM method. For comparison, Figure 5 shows the Wine data set with inverted axes. Figure 9 (left) shows the lowest quality of class separation for the Wine data set, while Figure 5 (left) shows the highest.

Figure 9: The Wine data. (Left) The best permutation by the t-statistic method. (Right) The best permutation by the CDM method.

Figure 10 shows the Olive data set with the two best permutations, using the t-statistic method (left) and the CDM method (right); a comparison with the inversion-axes layout is provided in Figure 7. Figure 10 (left and right) shows the lowest quality of class separation in the visual space, while Figure 7 (left and right) exhibits higher quality of class separation for both.

Figure 10: The Olive Oil data. (Left) The best permutation by the t-statistic method. (Right) The best permutation by the CDM method.

7 Conclusion

We have presented a new method for visualizing multidimensional data based on radial visualization. Our proposed method supports users in choosing a suitable view of a data set in the unit hypercube. We demonstrated the effectiveness of our method against the permutation of dimensional anchors in Radviz on several supervised data sets, both synthetic and real. For future work, we want to improve our method to enhance class structures in subspaces of supervised data sets. Moreover, we want to develop other quality measurements for supervised data sets.

Acknowledgement

This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2012.04.

References

[1] G. Albuquerque, M. Eisemann, D. J. Lehmann, H. Theisel, and M. Magnor. Improving the visual analysis of high-dimensional datasets using quality measures. In IEEE Symposium on Visual Analytics Science and Technology (VAST), pages 19–26, 2010.
[2] G. Albuquerque, M. Eisemann, D. J. Lehmann, H. Theisel, and M. A. Magnor. Quality-based visualization matrices. In Proceedings of the Vision, Modeling and Visualization Workshop (VMV), Braunschweig, Germany, pages 341–350, 2009.

[3] M. Ankerst, S. Berchtold, and D. A. Keim. Similarity clustering of dimensions for an enhanced visualization of multidimensional data. In Proceedings of the IEEE Symposium on Information Visualization (InfoVis '98), pages 52–60, 1998.

[4] M. Ankerst, D. A. Keim, and H.-P. Kriegel. Circle segments: A technique for visually exploring large multidimensional data sets. In Proceedings of the 1996 IEEE Symposium on Information Visualization, Hot Topic Session, San Francisco, CA, 1996.

[5] A. O. Artero and M. C. F. de Oliveira. Viz3d: Effective exploratory visualization of large multidimensional data sets. In Proceedings of the 17th Brazilian Symposium on Computer Graphics and Image Processing, pages 340–347, 2004.

[6] S. K. Card, J. D. Mackinlay, and B. Shneiderman. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann, 1999.

[7] K. Daniels, G. Grinstein, A. Russell, and M. Glidden. Properties of normalized radial visualizations. Information Visualization, 11(4):273–300, 2012.

[8] L. di Caro, V. Frias-Martinez, and E. Frias-Martinez. Analyzing the role of dimension arrangement for data visualization in Radviz. In Advances in Knowledge Discovery and Data Mining, pages 125–132, 2010.

[9] J. Havrda and F. Charvát. Quantification method of classification processes: Concept of structural α-entropy. Kybernetika, 3(1):30–35, 1967.

[10] P. Hoffman, G. Grinstein, K. Marx, I. Grosse, and E. Stanley. DNA visual and analytic data mining. In Proceedings of the 8th Conference on Visualization 1997, pages 437–441, 1997.

[11] P. Hoffman, G. Grinstein, and D. Pinkney. Dimensional anchors: a graphic primitive for multidimensional multivariate information visualizations. In Proceedings of the 1999 Workshop on New Paradigms in Information Visualization, pages 9–16, 1999.

[12] I. Jolliffe. Principal Component Analysis. Wiley Online Library, 2005.

[13] A. Inselberg. The plane with parallel coordinates. The Visual Computer, 1(2):69–91, 1985.

[14] E. Kandogan. Star coordinates: A multi-dimensional visualization technique with uniform treatment of dimensions. In Proceedings of the IEEE Information Visualization Symposium 2000, volume 650, pages 4–8, 2000.

[15] D. A. Keim, M. Ankerst, and H.-P. Kriegel. Recursive pattern: A technique for visualizing very large amounts of data. In Proceedings of the 6th Conference on Visualization '95, pages 279–286, 1995.

[16] G. Leban, B. Zupan, G. Vidmar, and I. Bratko. VizRank: Data visualization guided by machine learning. Data Mining and Knowledge Discovery, 13:119–136, 2006.

[17] D. J. Lehmann and H. Theisel. Orthographic star coordinates. IEEE Transactions on Visualization and Computer Graphics, 19(12):2615–2624, 2013.

[18] X. Li, K. Zhang, and T. Jiang. Minimum entropy clustering and applications to gene expression analysis. In Computational Systems Bioinformatics Conference (CSB 2004), pages 142–151, 2004.

[19] J. McCarthy, K. Marx, P. Hoffman, A. Gee, P. O'Neil, M. Ujwal, and J. Hotchkiss. Applications of machine learning and high-dimensional visualization in cancer detection, diagnosis, and management. Annals of the New York Academy of Sciences, 1020(1):239–262, 2004.
[20] M. Rubio-Sanchez and A. Sanchez. Axis calibration for improving data attribute estimation in star coordinates plots. IEEE Transactions on Visualization and Computer Graphics, 20(12):2013–2022, Dec. 2014.

[21] M. Sips, B. Neubert, J. P. Lewis, and P. Hanrahan. Selecting good views of high-dimensional data using class consistency. Computer Graphics Forum, 28(3):831–838, 2009.

[22] T. Van Long and L. Linsen. Visualizing high density clusters in multidimensional data using optimized star coordinates. Computational Statistics, 26(4):655–678, 2011.

Emotional Contagion Model for Group Evacuation Simulation

Xuan-Hien Ta
Toulouse University, UPS-IRIT, Toulouse, France
E-mail: hientpbk@gmail.com

Benoit Gaudou
Toulouse University, UT1C-IRIT, Toulouse, France
E-mail: benoit.gaudou@gmail.com

Dominique Longin
Toulouse University, CNRS-IRIT, Toulouse, France
E-mail: Dominique.Longin@irit.fr

Tuong Vinh Ho
Vietnam National University, Hanoi, Vietnam
E-mail: vinhht@vnu.edu.vn

Keywords: emotion, simulation, agent-based model, GAMA platform, crisis situation, evacuation process

Received: March 29, 2017

The key role of emotions in the decision-making process of human beings has been highlighted recently. Our research focuses on fear-related emotions and their positive impact on the survival capabilities of human beings in crisis situations. In this paper, we propose a new model of emotional contagion based on some main findings in social psychology. This model was formalized mathematically, implemented and tested in the GAMA agent-based simulation platform in the context of evacuation simulation. We assessed experimentally the impact of three factors (emotion decay, environment, neighbors' emotional contagion) on emotion dynamics at the individual and group levels. The experimental results allow us to understand the emotional contagion of an agent group in several scenarios. The proposed model will help us to better study the impact of emotional contagion on evacuation safety. The entire theoretical model has been implemented in the simulation platform GAMA.

Povzetek: The paper analyzes fear-related emotions in the case of an evacuation.

1 Introduction

Emotions, those reflexes that push human beings to make decisions quickly and without a deep and clear reasoning process, have long been considered contrary to rational reasoning processes. Only recently has the key role of emotions in the decision-making process been highlighted. We focus on fear-related emotions and their positive impact on the survival capabilities of human beings in crisis situations. Indeed, recent works have shown that emotion is a very important factor in the understanding of human behaviour in crisis situations (see [9, 10, 28, 4] for instance). Emotion has been studied for a long time in psychology and in philosophy, and more recently in cognitive science (see [27, 21, 31, 12] for instance). These works have shown the close relationship between a person's emotional state and that person's action tendencies. Indeed, emotions play a central role in cognition, especially when we need to react very quickly (as is the case in crisis situations). Instantaneously, emotions provide us with a set of possible actions (called action tendencies by Lazarus [21]) that are strongly related to the situation.
An emotion can be viewed as a summary of the situation, of how this situation can affect us, and of what power we have over the real world to change the present situation into a positive one for us. So, emotions have a great power of explanation of our actions in crisis situations.

In crisis situations, the most remarkable expression of fear is definitely panic behavior. While early research on panic presented it as groundless fear or flight behavior, other work describes it as a crowd in dissolution. Nevertheless, in situations such as fires or disasters, [26] has shown that it is in fact a very meaningful behaviour, far from most conceptions of irrationality. Panic behaviour exists but is in fact quite rare. It is an individual behaviour, in opposition to a behaviour of the crowd; it is not contagious and occurs over a short duration. It is rarely observed in crisis situations. Some particular conditions of panic triggering have been identified, such as: the perception of a great threat to self, a belief that escape from the threat is possible but very hard to achieve, and a feeling of helplessness [28, 14]. Some additional factors may also have an influence on triggered emotions, such as experience in emergency situations and information; information is the key to a successful evacuation strategy during a crisis [29]. The sex and age of an individual can also cause a different fear level.

In addition, as has been shown in [28], panic is not the predominant emotion in crisis situations. A lot of reports (see [11] for instance) show that when the danger increases, the mutual aid between the humans exposed to this danger also increases. People share emotions and information, and they help each other, even if they were strangers before; there are very few cases of selfishness. One of the faces of this mutual aid is the constitution of groups of persons: people in a group of friends or in a family try to stay together every time it is possible. Sociological studies show that groups increase our chances to be saved [9] (an evolutionary condition). In our previous work [32], we studied the impact of groups on the evacuation process. In this paper, we focus only on emotion contagion.

In the simulation area, a lot of works focus more specifically on emotion contagion. For instance, in [24], the authors present simulations about the relationships between emotions, information and beliefs. All members of a group can absorb the emotions of other members (of the same group) to create an average value of emotion, but they can also be influenced by the members of other groups. In this case, the average emotion of the group can be increased (amplification) or decreased (absorption). We can understand the absorption of emotions as a bottom-up approach, and the amplification of emotion as a top-down approach. The authors propose the idea that agents with a high emotion (above a high threshold) or a low emotion (under a low threshold) will play different roles (increasing or decreasing) depending on characteristics of the agent such as openness, expressiveness, and the capacity to receive or express emotions from/to others. Similarly, in [5], the authors give another interesting perspective on the contagion of emotion within a group.

In the GAMA agent-based simulation community [33, 17], several models (see [25, 22] for instance) have shown the important role played by emotions in emergency situations.
In [25], the authors simulate the emotion dynamics in a group. They give a new operational model of emotion contagion and implement the evacuation process (avoiding both obstacles and the other agents). They evaluate the model with respect to the evacuation time according to several criteria. When the emotion intensity changes, the walking speed of the corresponding agents also changes and impacts the evacuation time. But we can also criticize the fact that the emotion modeling is still very basic: we need a more complex cognitive model of emotions if we want to simulate agent behaviors as naturally as possible.

This article provides a new model of emotion dynamics. We focus here only on fear, because this emotion plays an important role in crisis situations. We propose to model the emotion following three main findings from both cognitive psychology and social psychology:

1. Emotions have triggering conditions (see [27, 21] for instance): a cognitive appraisal of these conditions determines whether they are fulfilled or not. (By this assumption, we suppose that emotion belongs to cognition; this is the point of view of the great majority of the psychology community (see [21, 27, 12, 31] for instance), and this view is called the "cognitive theory of emotion".) Following these authors, fear is triggered when we perceive a danger to our own life. Here, perception can be direct (an agent sees a fire or hears an alarm) or indirect (some other agents feeling fear influence the fear level of this agent).

2. Emotion intensity decreases with time: when the triggering conditions are no longer satisfied, an emotion does not disappear instantaneously (it is a process that takes time).

3. Finally, new perceptions from the environment (fires, alarms, influence of others) can modify the intensity level of fear, which can increase or decrease.

As far as we know, there is no model that takes all these factors into account in an intuitive manner. More precisely, many factors may impact the emotion, but here we only take into account three main ones: the environment (crisis perception), emotional decay and contagion. The emotion model is implemented in GAMA — an open-source, generic agent-based modeling and simulation platform that provides powerful tools to easily develop agent-based models, in particular using geographical data, and that allows the modeler to run simulations in either an interactive or a batch mode, which we use to run the experiment designs exploring the model — and is part of a project about evacuation simulations in crisis situations.

This paper is organized as follows. We first describe the model of emotion dynamics in Section 2. In Section 3, we assess the impact of the three factors (emotion decay and contagion, environment) on the emotion dynamics. Then we conduct a sensitivity analysis of the emotion model in Section 4. Finally, we conclude our work with some perspectives.

2 Model of emotion dynamics

2.1 Agent structure

As presented above, this article focuses only on one emotion and its diffusion, so the environment is described in a simple manner. In particular, there are neither obstacles nor exit doors (because neither has any impact on our results); the environment only contains some fires and human agents.

Let $AGT = \{i, j, k, \ldots\}$ be the finite set of human agents used in the simulation, $FIRE = \{f_1, f_2, \ldots\}$ the finite set of fires, and $TIME = \{t_0, t_1, \ldots\}$ the finite set of time points, where $t_0$ is the initial state of the simulation. The set of all the entities of the simulation is $ENT = AGT \cup FIRE$. We denote by $card(E)$ the cardinality of the set $E$; so $card(AGT)$, for instance, is the number of agents, and $t_{card(TIME)-1}$ is the final state of the simulation.
Each agent $i$ at time $t$ is characterized by the 6-tuple $\langle pos_i, visualRadius_i, neighbRadius_i, emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i \rangle$, where:

– $pos_i : TIME \to \mathbb{R} \times \mathbb{R}$ is the function that maps each time point $t$ to the position $pos_i(t)$ of agent $i$ at time $t$. We extend this function to any entity $e \in ENT$.

– $visualRadius_i : TIME \to \mathbb{R}$ is the function that maps each time point $t$ to the visual radius $visualRadius_i(t)$ of $i$ at time $t$. We consider here that each agent has its own perception radius and that this perception radius can change during the evacuation process (because of smoke, fire, obstacles, etc.). In some scenarios, we suppose that the value $d$ of the visual radius does not change over time, and we write $visualRadius_i = d$.

– $neighbRadius_i : TIME \to \mathbb{R}$ is the function that maps each time point $t$ to the neighborhood radius $neighbRadius_i(t)$ of $i$ at time $t$. We impose that $neighbRadius_i(t) \le visualRadius_i(t)$ for every agent $i$ and time point $t$. In some scenarios, we suppose that the value $d$ of the neighborhood radius does not change over time, and we write $neighbRadius_i = d$.

– $emDecayCoeff_i \in [0, 1]$ is the decay coefficient of $i$'s emotion intensity (see Section 2.2). From a psychological point of view, some agents are more impressionable than others; this coefficient depends on personologic data [11], and we suppose here that it does not change over time.

– $fireInflCoeff_i \in [0, 1]$ is the fire influence coefficient on $i$. Since some agents can be more experienced with some dangers (such as fire) than other agents, the impact of a given danger depends on the agent who faces it. The more experienced an agent is with a danger, the lower its fire influence coefficient.

– $agtInflCoeff_i : AGT \to [0, 1]$ maps every agent $j \in AGT$ to the coefficient of influence $agtInflCoeff_i(j)$ of agent $j$ on $i$. It is well known in the social influence literature (see [19, 15] for instance) that we are influenced by others with respect to beliefs, desires, norms, etc. It is the same with emotional states, but, due to the personality of each person, one can be more or less influenced by others. The coefficient $agtInflCoeff_i(j)$ takes this aspect into account: the higher this coefficient, the more agent $i$ is influenced by agent $j$.

So, we are able to define the following abbreviations (for every $e, e' \in ENT$, $t \in TIME$ and $i \in AGT$):

$$distance(e, e', t) \stackrel{def}{=} \| pos_{e'}(t) - pos_e(t) \|$$

$$detectedFires_i(t) \stackrel{def}{=} \{ f \in FIRE : distance(i, f, t) \le visualRadius_i(t) \}$$

$$minDistFires_i(t) \stackrel{def}{=} \min \{ distance(i, f, t) : f \in detectedFires_i(t) \}$$

$$N_i(t) \stackrel{def}{=} \{ j \in AGT : distance(i, j, t) \le neighbRadius_i(t) \}$$

$distance(e, e', t)$ is the distance between the positions of entities $e$ and $e'$ at time $t$; $detectedFires_i(t)$ is the set of fires within the visual radius of agent $i$ at time $t$; $minDistFires_i(t)$ is the minimal distance between agent $i$ and all the fires it perceives at time $t$; and $N_i(t)$ is the set of neighbors of agent $i$ at time $t$.
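As a minimal sketch of this structure (ours — the paper's implementation is in GAMA, not Python, and all names below are our own; fire positions are assumed to be numpy arrays):

```python
import numpy as np

class Agent:
    """The 6-tuple characterizing an agent, plus its current fear level."""
    def __init__(self, pos, visual_radius, neighb_radius,
                 em_decay_coeff, fire_infl_coeff, agt_infl_coeff):
        self.pos = np.asarray(pos, dtype=float)   # pos_i(t)
        self.visual_radius = visual_radius        # visualRadius_i(t)
        self.neighb_radius = neighb_radius        # neighbRadius_i(t) <= visualRadius_i(t)
        self.em_decay_coeff = em_decay_coeff      # emDecayCoeff_i in [0, 1]
        self.fire_infl_coeff = fire_infl_coeff    # fireInflCoeff_i in [0, 1]
        self.agt_infl_coeff = agt_infl_coeff      # maps agent j -> agtInflCoeff_i(j)
        self.fear = 0.0                           # fear_i(t), defined in Section 2.2

def detected_fires(i, fires):
    """detectedFires_i(t): fires within the visual radius of agent i."""
    return [f for f in fires if np.linalg.norm(f - i.pos) <= i.visual_radius]

def min_dist_fires(i, fires):
    """minDistFires_i(t), or None when no fire is perceived."""
    seen = detected_fires(i, fires)
    return min(np.linalg.norm(f - i.pos) for f in seen) if seen else None

def neighbors(i, agents):
    """N_i(t): agents within the neighborhood radius (self excluded here,
    which is our assumption)."""
    return [j for j in agents if j is not i
            and np.linalg.norm(j.pos - i.pos) <= i.neighb_radius]
```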
We suppose here that the closer a fire is to us, the more afraid of it we are. For the sake of simplicity, we therefore suppose that the emotional reaction to distant dangers is subsumed by the emotional reaction to the closest danger(s) we perceive, so only the closest fires are taken into account here.

Finally, we define in the next sections the function $fear_i(t)$ that computes the fear level of agent $i$ at each time point $t$. At the initial time $t_0$, $fear_i(t_0)$ is fixed for each agent $i$; the fear level at time $t > t_0$ is computed dynamically during the simulation steps. More precisely, the fear intensity change from time $t - 1$ to time $t$ (that is, the change from $fear_i(t-1)$ to $fear_i(t)$) is a three-step process depending on three successive functions:

1. $\Delta fearDecay_i(t)$ describes the loss of emotion intensity from $t - 1$ to $t$ due to time. If $fear_i(t-1) = 0$ (that is, the fear level at time $t - 1$ is 0), then $\Delta fearDecay_i(t) = 0$; otherwise, $\Delta fearDecay_i(t)$ is the value corresponding to the loss of emotion intensity between $t - 1$ and $t$ (see Section 2.2).

2. $\Delta fearEnv_i(t)$: if the current fear level after decay is equal to 0, then a value computed from a sigmoid function is returned; otherwise, the variation of the fear between $t - 1$ and $t$ is added. This variation is computed from the derivative of the sigmoid between $t - 1$ and $t$ and corresponds to the effect on $i$'s fear level of the fires that agent $i$ detects around itself (if fires are detected) (see Section 2.3).

3. $\Delta fearNeighb_i(t)$ is the variation of the fear (which can be positive or negative) coming from the influence of $i$'s neighbors. If these neighbors have a fear level lower than the fear level of $i$ (after decay and influence of the environment), then the fear level of $i$ decreases; otherwise it increases (see Section 2.4).

Finally, $fear_i(t)$ is the new value of the fear intensity at time $t$, defined as a composition of the above three components. Note that we could compute the fear level as the sum of three independent functions: one for the decay process, one for the environment influence process, and one for the neighborhood influence process. But such a sum could be less than 0 or greater than 1, whereas we require the fear level to stay between 0 and 1. So, we prefer to compute the resulting emotion intensity as a composition of functions, because this avoids situations where the result is not between 0 and 1.

2.2 Emotion Decay over Time

As highlighted in the literature [27, Chap. 4], without any stimulus, an agent's fear intensity decreases over time. This decay is often described as faster for higher values of emotion intensity, slowing down when the emotion intensity is low. At time $t$ and for every agent $i \in AGT$, the value of the fear decay (the loss of emotion intensity) is noted $\Delta fearDecay_i(t)$. This value is a function of the previous emotion level $fear_i(t-1)$ and of $emDecayCoeff_i \in [0, 1]$ (the decay coefficient, which depends on some attributes of each agent such as gender, age, etc. [11]). Moreover, we suppose that this decay coefficient does not vary over time. These requirements lead us to the following function for emotion decay over time (see Figure 1):

$$\Delta fearDecay_i(t) \stackrel{def}{=} -emDecayCoeff_i \times fear_i(t-1). \quad (1)$$

We can first notice that if $fear_i(t-1) = 0$ (e.g. at the initial simulation step), then $\Delta fearDecay_i(t) = 0$, and $fear_i(t)$ (the emotion level at time $t$) is not modified by (1). So the decay does not trigger any emotion; it only decreases its value with time. Moreover, the greater $emDecayCoeff_i$ is, the more quickly the emotional level decreases. Finally, note that the emotion decay has the same shape as the "activation level decreasing" in Anderson's theory of central cognition [3]. It could certainly be oversubtle, but this form has the advantage of being computationally convenient.

In Figure 1, the fear function is limited to the fear decay effect (what we call $fearDecay_i(t)$), so its evolution is described by

$$fearDecay_i(t) = fear_i(t-1) + \Delta fearDecay_i(t).$$

Figure 1: Fear decay with $emDecayCoeff_i = 0.02$ for any agent $i$ and without any other stimulus.
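In code, the decay step is one line (a sketch under the same assumptions as before):

```python
def fear_decay(fear_prev, em_decay_coeff):
    """Decay step: fearDecay_i(t) = fear_i(t-1) + Delta_fearDecay_i(t),
    with Delta_fearDecay_i(t) = -emDecayCoeff_i * fear_i(t-1), eq. (1)."""
    return fear_prev - em_decay_coeff * fear_prev

# Without any stimulus the level decays geometrically, as in Figure 1:
# starting from 1.0 with emDecayCoeff_i = 0.02, the level after t steps
# is 0.98 ** t.
```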
2.3 Environment Influence on Emotion

The environment contains dangers (fires for instance), warnings (alarms, ...) and other elements (smoke, ...) that may have an impact on emotions. In particular, dangers may trigger a fear emotion or increase the fear intensity. In the following, we consider two distinct processes: a) emotion is triggered when the agent does not feel fear yet, and b) the fear level is updated when an agent already feels fear and has to face a hazard.

Emotion triggering (when $fearDecay_i(t) = 0$). When agent $i$ does not feel fear at time $t$ just after the emotion decay computation ($fearDecay_i(t) = 0$) and perceives a hazard or hears an alarm, the appraisal of this stimulus triggers an emotion. We make the assumption that both the distance to the danger and the number of dangerous elements the agent perceives influence the intensity of the triggered emotion.

The fear degree should be an increasing function of the number of hazards, but a logarithm-like one, to capture the fact that the difference in intensity is greater when the agent observes a small number of fires (for instance, 2 fires instead of 1) than when it observes a huge number (for instance, 102 fires instead of 101). In addition, we consider that the intensity should be a decreasing function of the distance to the hazard, and we assume that the relevant distance $minDistFires_i(t)$ at time $t$ from agent $i$ to the hazards is the distance to the closest hazard, not the average distance to all fires in $i$'s neighborhood (see Section 2.1).

As a consequence, emotion triggering when fires occur in the perception radius $visualRadius_i(t)$ of agent $i$ at time $t$ is formalized as follows. When $fearDecay_i(t) = 0$, we define the intensity of the triggered fear by:

$$\Delta fearEnv_i(t) \stackrel{def}{=} \frac{1}{1 + e^{-\lambda_i \left( 1 - \frac{minDistFires_i(t)}{visualRadius_i(t)} \right)}}. \quad (2)$$

Clearly, $\Delta fearEnv_i(t)$ is a sigmoid function, where $\lambda_i$ characterizes the steepness of the curve. $\lambda_i$ should increase with the number of fires in $i$'s perception area at time $t$ (formally, $card(detectedFires_i(t))$), and it also depends on the fire influence coefficient of agent $i$ ($fireInflCoeff_i$). So:

$$\lambda_i \stackrel{def}{=} fireInflCoeff_i \times \left( 1 - \frac{1}{card(detectedFires_i(t)) + 1} \right). \quad (3)$$

Note that $fireInflCoeff_i$ could depend on agent $i$'s knowledge about and experience with fire [23]. We have chosen a sigmoid function here because this type of function illustrates perfectly the switch between a low level of fear intensity (a level under the triggering threshold of fear) and the triggering of fear. We use a particular steepness $\lambda_i$ that can easily be changed, depending on the experimental situation.

Figure 2 illustrates the impact of the number of fires and of their distance on the initial fear level; note that (2) ensures that $\Delta fearEnv_i(t) \in [0, 1]$. In Figure 2, the fear at time $t$ is computed only from the environment influence (neither emotion decay nor neighbors' influence is applied), and it is supposed that as time increases, the number of fires decreases. Several simulations have been executed, corresponding to several minimal distances between agent $i$ and the fires (that is, $minDistFires_i(t) \in \{0.0, 5.0, 10.0, \ldots, 40.0\}$), so the evolution is described by $fearEnv_i(t) = \Delta fearEnv_i(t)$. The numerical values chosen in this section reflect a case study of the size of a supermarket; the other coefficients were chosen so that the results illustrate the equations well (the exploration of the various parameter values is provided in Section 4). Note that the lower $minDistFires_i(t)$ is, the higher the intensity of fear when the number of fires is maximal.

Figure 2: Fire number and distance impact on the emotion level (with $visualRadius_i = 40$ and $fireInflCoeff_i = 1$).

Emotion update (when $fearDecay_i(t) > 0$). When $fearDecay_i(t) > 0$, fear has already been triggered, and we assume that the perception of fires must change this previous fear level. So, we use the derivative (4) of the sigmoid described in (2) to update the emotion level step by step. For convenience's sake, let

$$\lambda'_i \stackrel{def}{=} \lambda_i \times \left( 1 - \frac{minDistFires_i(t)}{visualRadius_i(t)} \right).$$

Then $\Delta fearEnv_i(t)$ is simply the variation of fear following from the environment influence on the emotion level at time $t$ (without taking the emotion decay into account). That is, when $0 < fearDecay_i(t) < 1$:

$$\Delta fearEnv_i(t) \stackrel{def}{=} fearDecay_i(t) \cdot \left( 1 - fearDecay_i(t) \right) \cdot \lambda'_i. \quad (4)$$

Figure 3 presents the evolution of the fear level under the single influence of the environment (fire). The fear evolution is thus described by the equation:

$$fearEnv_i(t) = fearDecay_i(t) + \Delta fearEnv_i(t).$$

Figure 3: Emotional level dynamics influenced only by the environment ($emDecayCoeff_i = 0$) with (for every $i \in AGT$ and $t \in TIME$): $fireInflCoeff_i = 0.1$, $card(detectedFires_i(t)) = 2$, $minDistFires_i(t) = 10$, $visualRadius_i(t) = 40$, and with $fear_i(t_0) = 0.05$.
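A sketch of the environment step, combining the triggering case (2)-(3) and the update case (4) (ours; the handling of the no-fire case is our assumption, since the equations are only defined when fires are perceived):

```python
import math

def fear_env(fear_after_decay, n_fires, min_dist, visual_radius, fire_infl_coeff):
    """Environment step: returns fearEnv_i(t) from fearDecay_i(t)."""
    if n_fires == 0:
        return fear_after_decay                          # no perceived hazard
    lam = fire_infl_coeff * (1 - 1 / (n_fires + 1))      # steepness lambda_i, eq. (3)
    closeness = 1 - min_dist / visual_radius             # 1 - minDistFires / visualRadius
    if fear_after_decay == 0:
        return 1 / (1 + math.exp(-lam * closeness))      # triggering, eq. (2)
    # update via the sigmoid's derivative, eq. (4); keeps the level in [0, 1]
    return fear_after_decay + fear_after_decay * (1 - fear_after_decay) * lam * closeness
```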
We have chosen here a sigmoid function because this type of function illustrates perfectly the switch between a low level of the fear intensity3 and the triggering of fear. We use here a particular steepness λi that must be easily changed, depending of the experimental situation. In Figure 24, fear at time t is computed only from the en- vironment influence (neither emotion decay is applied nor neighbors influence). It is supposed here that the more time increases, the more fires number decreases. Sev- eral simulations have been executed, corresponding to sev- eral minimal distances between agent i and fires (that is: minDistFiresi(t) ∈ {0.0, 5.0, 10.0, · · · , 40.0}). So, its evolution is described by fearEnv i(t) = ∆fearEnv i(t). Note that the more minDistFiresi(t) is low, the more the intensity of fear is high when the number of fires is maxi- mal. Emotion update (when fearDecay i(t) > 0). When fearDecay i(t) > 0, fear has already been triggered and we assume that the perception of fires must change this previ- ous fear level. So, we use the derivative (4) of the previous sigmoid described in (2) to update step by step the emotion level. For convenience’ sake, let be λ′i def = λi × ( 1− minDistFiresi(t) visualRadiusi(t) ) . So, ∆fearEnv i(t) is just the variation of fear following from the environment influence on the emotion level at time t. That is: ∆fearEnv i(t) def = fearDecay i(t).(1− fearDecay i(t)).λ′i (4) when 0 < fearDecay i(t) < 1 3By low level, we means a level that is under the triggering threshold of fear. 4The numerical values chosen in this section have been chosen with a case study of the size of a supermarket in mind. For the other coefficients, they have been chosen in order that results to be good illustration of the equations. The exploration of the various values of parameters is provided in the Section 4 ∆fearEnv i(t) is here the variation of i’s fear level at time t after the influence of the environment on the emotion level (without taking into account the emotion decay). Figure 3 presents the evolution of the fear level under the single influence of the environment (fire).The fear evo- lution is thus described by the equation: fearEnv i(t) = fearDecay i(t) + ∆fearEnv i(t). 2.4 The Neighbors’ Emotional Contagion The two previous subsections focused on the individual part of the emotion. We consider here its social aspect: emotions can spread among neighbors. This has already been investigated in many works, such as [13, 5] where the emotion of an agent tends to the average value of all the agents over time (as in our model). In our model, an agent detects its neighbors at time t based on its visual radius (see Ni(t) in Section 2.1). So, the emotional influence of agent j on agent i at time t is the difference between the emotion level of i and the emo- tion level of j at time t. This influence is weighted by the influence coefficient agtInflCoeff i(j) of j on i. So, for- mally: InfluenceOf j i(t) def = ( fear j(t− 1)− fearEnv i(t) ) × agtInflCoeff i(j) (5) agtInflCoeff i(j) depends on the relationship between i and j: stronger theses relationships are, higher this value is. This equation is based on the bounded confidence model of [18]. Some equations have been proposed in the social network analysis area (see [7, 20, 19, 16, 30] for instance) corresponding to the modelling of different situations. Note that if fear j(t − 1) > fearEnv i(t) then InfluenceOf j i(t) > 0: it means that the fear level of i will increase. 
2.5 The Emotion Level Global Equation

The new emotion level of agent $i$ at time $t$, after the decay due to time (see Section 2.2), the influence of the environment (see Section 2.3), and the influence of $i$'s neighbors (see Section 2.4), is simply:

$$fear_i(t) = fearNeighb_i(t). \quad (7)$$

(This is because we have chosen to compute the fear at time $t$ as a composition of functions.)

2.6 Additional Influences of the Environment on Emotion

Some other factors may impact agents' emotions in different manners. For instance, the influence of smoke is similar to that of fire, but the impact coefficient can be different. The influence of an alarm does not depend on the distance, as we can suppose that all people hear the alarm. Finally, we can also mention as additional factors influencing agents' emotions: the fear reduction due to a security agent, the impact of the perception of an exit door, or the impact of help received from others.

3 Experiments on the emotion dynamics

In this section, we assess the impact of various possible combinations of the three factors (emotion decay, contagion and environment) on the emotion dynamics. We first investigate the emotion dynamics alone and then couple it with a second dynamics: agents' movement. (Note that in the following, $i$'s visual radius does not change over time, and we write it $visualRadius_i$.)

3.1 Emotion Dynamics with Unmoving Agents

The following results are computed with $card(AGT) = 20$ and $card(FIRE) = 10$ and with the following values of the agent parameters (for every agent $i \in AGT$): $emDecayCoeff_i = 0.02$, $fireInflCoeff_i = 0.1$, $agtInflCoeff_i(j) = 0.04$ for every $j \in N_i(t)$ and every $t \in TIME$, and $visualRadius_i = 40$. Neither the agents nor the fires move.
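The experiments below iterate the full composition (7). A minimal synchronous update loop, built from the sketches above (the simultaneous update of all agents from the $t-1$ levels is our reading of the model):

```python
def step(agents, fires):
    """One time step: fear_i(t) = fearNeighb_i(t), the composition of the
    decay, environment and contagion steps (eqs. (1)-(7))."""
    for a in agents:
        a.fear_prev = a.fear                     # freeze fear_i(t-1)
    for a in agents:
        after_decay = fear_decay(a.fear_prev, a.em_decay_coeff)   # Section 2.2
        seen = detected_fires(a, fires)                           # Section 2.3
        a.fear_env = fear_env(after_decay, len(seen),
                              min_dist_fires(a, fires),
                              a.visual_radius, a.fire_infl_coeff)
    for a in agents:
        a.fear = fear_neighbors(a, agents)       # Section 2.4, eq. (7)
```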
3.1.1 Emotional Contagion

In these simulations, we first check the impact of the random distribution of agents in the environment on the contagion. As they have a limited perception radius, agents are not able to diffuse their emotion to all other agents. We initialize the agents' fear levels to random values in $[0, 1]$. The result is presented in Figure 5: the agents' emotions tend towards a limited number of values, each of which corresponds to a spatially clustered set of agents.

Figure 5: Emotion evolution of all the agents under the only effect of emotional contagion.

This convergence towards several stable values is quite common in the related field of social opinion dynamics. In particular, [8] proposed the bounded confidence model, which uses continuous opinion values and an acceptability threshold. When two agents (representing individuals moving in an abstract environment) meet each other, they share their opinions; if these are not too far apart (distance in terms of opinion below a given threshold), the opinions are altered so as to come closer. Depending on the parameters (interaction frequency, initial opinion distribution, or even interaction network topology), various kinds of convergence can appear: either convergence to an intermediate consensus or to one or two extremist opinions. In our case, we recognize basically the same pattern; the acceptability threshold of [8] corresponds for us to the perception radius that limits which agents can interact together.

3.1.2 Coupling Emotion Decay and Contagion

As we do not take into account the process triggering emotions from environment stimuli, we initialize $fear_i(t_0)$ randomly in $[0, 1]$ for every agent $i \in AGT$ and test the influence of the decay and contagion factors together. The result is presented in Figure 6. With no influence of fires, the fear level of each agent $i$ converges (due to the emotional contagion) and tends towards 0 (due to the decay). Nevertheless, we can notice that even without any stimulus, the fear level of some agents starts increasing, due to the contagion dynamics, before finally decreasing when the decay becomes the dynamics with the greatest influence on the system.

Figure 6: Emotion evolution of all agents under both the decay and the contagion effects.

3.1.3 Coupling Emotion Decay and Environment

Let $fear_i(t_0) = 0$ for every $i \in AGT$; the emotion will be triggered by the perception of fires. The result is presented in Figure 7. We first observe that the fear level of some agents stays at or tends towards 0, because they cannot perceive any fire.
Time to reach it depends on the distance to fires and the number of neighbours. Nevertheless we can again observe a stability of the results. In addition, due to emotional contagion over agents, no agent has its fear level staying at the value 0. Even agents that cannot perceive the danger start to feel fear because of their neighbors. Figure 9: Emotion evolution of all the agents under the decay, the environment and the contagion effects. Figure 10: Impact of all the factors (decay, environment, contagion) on the emotion intensity in case of moving agents. 3.1.5 Coupling Emotion Decay, Environment and Emotion Contagion Finally we couple the three processes in a single model. Figure 9 displays the results. The results show again that fear levels tend to a stable value. This value is obviously lower than the value obtained without decay (see Figure 8). But it is interesting to note that the fear level values are also lower than the ones in the case without contagion (see Figure 7). The contagion process indeed drives fear level values to the average value which induces a decrease of the maximum value. 3.2 Emotion Dynamics with Moving Agents The previous results come from simulations with static agents and environment, providing, as expected, stable re- sults. In this section we will introduce agents mobility. We launch the simulations in the same conditions as the pre- vious ones, except that we have 10 agents. Agents move randomly in the environment: they pick a random target in the environment, move to it and when they reached it they choose a new one. Figure 10 displays each agent emotion evolution. We can observe that the results are not stable anymore. Emotional Contagion Model for Group. . . Informatica 41 (2017) 169–182 177 Figure 11: Impact of only the emotion contagion on the emotion intensity in case of moving agents. Indeed as the agents can move they will be sometimes close to fires, increasing their level fear, and sometimes far from them, decreasing their fear level. If we activate only the emotional contagion, we observe in the Figure 11 with moving agents that each agent fear level converges toward the same value. Contrarily to the results in Figure 5, we can observe here a convergence hav- ing moving agents removes the cluster effect that can occur when agents do not move. 4 Sensitivity analysis In this section, we explore the model behavior with respect to parameters variations. We only focus here on the three following coefficients for a given agent i: emDecayCoeff i, fireInflCoeff i and agtInflCoeff i, that characterize the three processes making emotion dynamic during the simu- lation. So, we will measure the maximum, minimum, aver- age and standard deviation values of the agents’ fear level at the end of the simulations. In addition we will com- pare results between two cases: with and without moving agents. We initialize simulations with card(AGT ) = 50, card(FIRE ) = 10, randomly located. For each parame- ters tuple 〈emDecayCoeff i,fireInflCoeff i, agtInflCoeff i〉 (where i ∈ AGT ) we run 10 simulations and measure the maximum, the minimum, the average and standard devia- tion values of the agent fear level at the step number 100. When agents can move, they choose a random target, go to it and when reached the target it picks randomly a new target. Figure 12: Impact of emDecayCoeff i (for every i ∈ AGT ) on the fear level of moving agents in case of fireInflCoeff i = 0.05 and agtInflCoeff i = 0.01. 
4.1 Exploration in the Case of Moving Agents

4.1.1 Exploration of the Impact of the Decay Coefficient emDecayCoeff_i

For every agent $i \in AGT$, let $fireInflCoeff_i = 0.05$, $agtInflCoeff_i = 0.01$ and $emDecayCoeff_i \in \{0.01, 0.02, 0.03, 0.04, 0.06\}$. We measure the four indicators presented above, denoted max, min, mean and standard deviation. The results are shown in Figure 12.

Figure 12: Impact of $emDecayCoeff_i$ (for every $i \in AGT$) on the fear level of moving agents, in the case $fireInflCoeff_i = 0.05$ and $agtInflCoeff_i = 0.01$.

We can observe that when $emDecayCoeff_i$ increases, the fear level tends towards 0. This means that when the decay coefficient is more important, the decay process has more influence on the simulation results.

4.1.2 Exploration of the Impact of all the Parameters

The previous Section 4.1.1 shows the impact on the fear level of varying the $emDecayCoeff_i$ parameter alone. We now launch an exhaustive exploration of the model with (for every agent $i \in AGT$):

– $emDecayCoeff_i \in \{0.01, 0.02, 0.03, 0.04, 0.06\}$

– $fireInflCoeff_i \in \{0.05, 0.1, 0.2, 0.3, 0.5\}$

– $agtInflCoeff_i \in \{0.01, 0.06, 0.1, 0.2, 0.3\}$

For each parameter tuple $\langle emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i \rangle$, we launched 10 simulations and stored the average value of each indicator. The complete results are summarized in Figure 13 and Figure 14. These figures display the scatter plots of all possible pairs of parameters and indicators (plotted using the R software: https://www.r-project.org/). For example, in Figure 13, the upper-right frame plots the max indicator in relation to the $emDecayCoeff_i$ parameter. All the bullets correspond to the projection of the tuples $\langle emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i, max \rangle$ (for every $i \in AGT$) onto a two-dimensional plane. This representation allows the modeler to isolate the influence of a single parameter on a single indicator. In addition, still looking at the upper-right frame, we can read the possible values of the $emDecayCoeff_i$ parameter on the right and the value range of the max indicator on the top.

Figure 13: For every agent $i \in AGT$, the max indicator depending on the $emDecayCoeff_i$, $fireInflCoeff_i$ and $agtInflCoeff_i$ values.

Figure 14: The max, min, mean and standard deviation values depending on the $emDecayCoeff_i$, $fireInflCoeff_i$ and $agtInflCoeff_i$ values, for every agent $i \in AGT$.

We can observe that $fireInflCoeff_i$ has a huge influence on the max indicator: when $fireInflCoeff_i$ is high (0.5), the maximum fear levels are also very high (between 0.7 and 1), and this result is independent of the other parameter values. When $fireInflCoeff_i$ is low, the maximum is lower and close to 0.

Similarly, we can observe that the $emDecayCoeff_i$ parameter has an effect on the boundaries of the max indicator: for every $i \in AGT$, when $emDecayCoeff_i$ is high, the maximum of the max indicator is limited to 0.8, whereas with the lowest value of this coefficient the limit is around 1, and many plots are concentrated around this value. We can notice that for intermediate values of the $emDecayCoeff_i$ coefficient, the plots are concentrated around 0.0 and 0.8. We thus have a polarization of the results around two main values, corresponding to the minimum and maximum values that the max indicator can take.
The complete results are summarized in Figure 13 and Figure 14. These figures display the scatter plots of all possible pairs of parameters and indicators. For example, in Figure 13 the upper-right frame plots the max indicator with respect to the emDecayCoeff_i parameter (plotted using the R software: https://www.r-project.org/). All the bullets correspond to the projection of the tuples ⟨emDecayCoeff_i, fireInflCoeff_i, agtInflCoeff_i, max⟩ (for every i ∈ AGT) onto a 2-dimensional plane. This representation allows the modeler to isolate the influence of a single parameter on a single indicator. In addition, still looking at the upper-right frame, we can read the possible values of the emDecayCoeff_i parameter on the right and the value range of the max indicator on the top.

Figure 13: For every agent i ∈ AGT, max indicator depending on the emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i values.

Figure 14: max, min, mean and standard deviation values depending on the emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i values for every agent i ∈ AGT.

Figure 15: Impact of emDecayCoeff_i on the fear level of unmoving agents when fireInflCoeff_i = 0.05 and agtInflCoeff_i = 0.01 (for every agent i ∈ AGT).

We can thus observe that (for every i ∈ AGT) fireInflCoeff_i has a huge influence on the max indicator: when fireInflCoeff_i is high (0.5), the maximum fear levels are also very high (between 0.7 and 1), and this result is independent of the other parameter values. When fireInflCoeff_i is low (0.05), the maximum is lower and close to 0. Similarly, we can observe that the emDecayCoeff_i parameters affect the boundaries of the max indicator: for every i ∈ AGT, when emDecayCoeff_i is high, the maximum of the max indicator is limited to 0.8, whereas with the lowest value of this coefficient the limit is around 1, and many points are concentrated around this value. We can notice that for intermediate values of the emDecayCoeff_i coefficient, points are concentrated around 0.0 and 0.8. We thus have a polarization of the results around two main values, corresponding to the minimum and maximum values that the max indicator can take.

We can also observe that agtInflCoeff_i does not have a visible impact on the max indicator: with high or low values of this coefficient, the max indicator takes values everywhere in [0, 1]. Looking at Figure 14, we can also notice that fireInflCoeff_i has a smaller influence on the min indicator, whereas emDecayCoeff_i has a larger one. In particular, when emDecayCoeff_i increases, the min indicator takes lower values. It is also interesting to notice that, when we consider emDecayCoeff_i, the distributions of the min and mean plots are very close, whereas when we consider fireInflCoeff_i, the max and mean plot distributions are close (and different from the min distribution). This means that, on average, the mean distribution is closer to the min (resp. max) plot distribution. Finally, we can observe that, even though agtInflCoeff_i does not have a significant influence on the max and mean indicators, it tends to reduce the standard deviation. This means that emotional contagion tends to level the fear values across agents.

4.2 Exploration in the Case of Unmoving Agents

We run simulations with the same initial conditions as in the previous section, except that agents do not move. The results are quite similar to those obtained with moving agents (Figure 15). This is due to the high number of agents and the chosen visual radius (visualRadius_i = 40 for every i ∈ AGT). We extend this experiment by varying emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i(j) (for every agent i and every j ∈ N_i(t)). The comparison is presented in Figures 16(a), 16(b), 16(c) and 16(d). We can observe that there is only a small difference in the emotion level values between the two cases: the agents' emotions seem not to depend on whether agents move or not. This can be explained by the large value of the visual radius: an agent can detect more agents, so it is influenced by more of them. A moving agent evidently has more opportunities to meet other agents, but with a large visual radius there is little difference between the two types of agents. Another important point is that the influence of a neighbour does not depend on its distance, so the distances between agents, as they move, do not play an important role.

Nevertheless, we go a little deeper in the comparison between simulations with moving and unmoving agents. We aim at evaluating the time needed for fear levels to converge under the influence of the emotional contagion process alone, and the influence of agtInflCoeff_i(j) (for every j ∈ N_i(t)) on this convergence. We run simulations and stop them when the standard deviation indicator becomes lower than 0.01, counting the number of simulation steps necessary to reach this state. The results are shown in Figure 17. We can observe that the number of steps needed to reach the equilibrium is higher for unmoving agents than for moving ones: moving agents tend to meet more other agents and this mixing speeds up the emotion convergence. The mixing has a huge impact when agtInflCoeff_i(j) (for every j ∈ N_i(t)) is low, but its impact decreases as the parameter value increases.
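The stopping rule just described is easy to state in code. The sketch below counts the steps until the dispersion of fear levels falls under the 0.01 threshold; the step function used in the demo is a hypothetical pure-contagion update over a fully connected group, so the numbers are illustrative only.

```python
import statistics

def steps_to_convergence(fears, step, tol=0.01, max_steps=100_000):
    """Run `step` until the standard deviation of the fear levels drops
    below `tol`, and return the number of steps that were needed."""
    for t in range(max_steps):
        if statistics.pstdev(fears) < tol:
            return t
        fears = step(fears)
    return max_steps

def contagion_step(fears, agt_infl=0.06):
    # Pure contagion toward the group mean (fully connected neighbourhood).
    mean = sum(fears) / len(fears)
    return [f + agt_infl * (mean - f) for f in fears]

print(steps_to_convergence([0.0, 0.3, 0.9, 0.5], contagion_step))
```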
(a) Unmoving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.01 for every j ∈ N_i(t). (b) Unmoving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.08 for every j ∈ N_i(t). (c) Moving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.01 for every j ∈ N_i(t). (d) Moving agents when varying emDecayCoeff_i with fireInflCoeff_i = 0.1 and agtInflCoeff_i(j) = 0.08 for every j ∈ N_i(t).

Figure 16: Comparison of moving and unmoving agents when varying the three factors emDecayCoeff_i, fireInflCoeff_i and agtInflCoeff_i(j).

Figure 17: Relationship between agtInflCoeff_i(j) (for every j ∈ N_i(t)) and the time needed for all the agents to reach an equivalent emotion level.

5 Conclusion and future works

In this article we proposed a model of fear level dynamics based on some main findings from social psychology. Our aim is to provide an intuitive formalization of the computational process for emotion modeling. The model was implemented in the GAMA agent-based simulation platform. We conducted intensive experiments to assess the three coefficients that impact the emotion intensity of an agent group, and presented our results about the impact of the decay, environment, and neighbouring-agent factors (i.e. emotional contagion) on emotion intensity.

We showed, using several scenarios, how emotion evolves over time and the role played by each variable of the simulation. In particular, the environment (here, the perception of fires) has a great influence on the maximum fear level, whereas the emotional contagion tends to bring emotions closer together in the agent population. Although the context of this paper is crisis situations and evacuation, the study remains abstract: its purpose is mainly to focus on the emotion dynamics model and its exploration.

The next step will be to integrate this emotional framework into a simulation of evacuation in crisis situations. Emotions will be used at several levels: the physical properties of agents (strong emotions can make people move faster or slower), the decision-making process (it is now established that emotions help to make decisions and often speed up the decision-making process, at the risk of making less efficient decisions), and social processes (in particular, group constitution and the effects of the group on its members). The main objective will be to provide more realistic evacuation simulations in terms of human behaviors, and thus to build decision-support systems for crisis managers. We thus attempt to make simulations more realistic by improving the behaviors of the human agents (in line with [1, 6]).

More particularly, two application cases can be very interesting. First, this work could help architects and urban planners to better design public spaces so that people can evacuate more easily, taking into account cognitive attitudes such as emotions or social bonds and not only simple physical flows of individuals. Second, we plan to apply this framework to the case study of Australian bushfire simulations [2]. These bushfires have killed hundreds of people and have been deeply studied, in particular through interviews of most of the survivors. An important conclusion of this survey was that civilians did not react and act as expected by the authorities in charge of fire preparedness and victim rescue. First models of the evacuation have been implemented, with a focus on the distinction between objective and subjective civilian capabilities and perception of the environment. We argue that they could be improved by introducing emotional capabilities that influence these biases in the representation of the world.
6 Acknowledgments

This work is funded by research project number QG.15.31 at Vietnam National University, Hanoi, on the modeling and simulation of fire evacuation in public buildings. We also thank the referees of SoICT'2016 for their constructive remarks.

References

[1] C. Adam and B. Gaudou. BDI agents in social simulations: a survey. Knowledge Engineering Review (KER), 31(3):207–238, 2016.
[2] C. Adam and B. Gaudou. Modelling human behaviours in disasters from interviews: application to Melbourne bushfires. Journal of Artificial Societies and Social Simulation (JASSS), (to appear), 2017.
[3] J. R. Anderson and C. Lebiere. The Atomic Components of Thought. Lawrence Erlbaum Associates, Mahwah, NJ, 1998.
[4] T. Bosse, R. Duell, Z. A. Memon, J. Treur, and C. N. van der Wal. Multi-agent model for mutual absorption of emotions. ECMS, 2009:212–218, 2009.
[5] T. Bosse, R. Duell, Z. A. Memon, J. Treur, and C. N. van der Wal. Agent-based modeling of emotion contagion in groups. Cognitive Computation, 7(1):111–136, 2015.
[6] P. Caillou, B. Gaudou, A. Grignard, Q. C. Truong, and P. Taillandier. A simple-to-use BDI architecture for agent-based modeling and simulation. ESSA, pages 15–28, 2015.
[7] M. de Groot. Reaching a consensus. Journal of the American Statistical Association, 69(345):118–121, 1974.
[8] G. Deffuant, D. Neau, F. Amblard, and G. Weisbuch. Mixing beliefs among interacting agents. Advances in Complex Systems, 03(01–04):87–98, 2000.
[9] J. Drury and C. Cocking. The mass psychology of disasters and emergency evacuations: A research report and implications for practice. Research report, University of Sussex, 2007.
[10] J. Drury, C. Cocking, and S. Reicher. Everyone for themselves? A comparative study of crowd solidarity among emergency survivors. British Journal of Social Psychology, 48:487–506, 2009.
[11] J. Drury, C. Cocking, and S. Reicher. The nature of collective resilience: Survivor reactions to the 2005 London bombings. International Journal of Mass Emergencies and Disasters, 27(1):66–95, 2009.
[12] J. Elster. Alchemies of the Mind: Rationality and the Emotions. Cambridge University Press, 1999.
[13] L. Fu, W. Song, W. Lv, and S. Lo. Simulation of emotional contagion using modified SIR model: A cellular automaton approach. Physica A: Statistical Mechanics and its Applications, 405:380–391, 2014.
[14] P. Gantt and R. Gantt. Disaster psychology: dispelling the myths of panic. Emergency Planning, 2012.
[15] U. Grandi, E. Lorini, A. Novaro, and L. Perrussel. Strategic disclosure of opinions on a social network. In Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2017), 2017.
[16] M. Granovetter. Threshold models of collective behavior. American Journal of Sociology, 83(6):1420–1443, 1978.
[17] A. Grignard, P. Taillandier, B. Gaudou, D. A. Vo, N. Q. Huynh, and A. Drogoul. GAMA 1.6: Advancing the art of complex agent-based modeling and simulation. In PRIMA 2013: Principles and Practice of MAS, pages 117–131. Springer, 2013.
[18] R. Hegselmann and U. Krause. Opinion dynamics and bounded confidence models, analysis, and simulations. Journal of Artificial Societies and Social Simulation, 5(3), 2002.
[19] D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.
[20] D. Kempe, J. Kleinberg, and E. Tardos. Influential nodes in a diffusion model for social networks. In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP-2005), 2005.
[21] R. S. Lazarus. Emotion and Adaptation. Oxford University Press, 1991.
[22] V. M. Le, C. Adam, R. Canal, B. Gaudou, T. V. Ho, and P. Taillandier. Simulation of the emotion dynamics in a group of agents in an evacuation situation. In N. Desai, A. Liu, and M. Winikoff, editors, Principles and Practice of MAS, volume 7057 of LNCS, pages 604–619. Springer, 2012.
[23] J. Leach. Why people 'freeze' in an emergency: Temporal and cognitive constraints on survival responses. Aviation, Space, and Environmental Medicine, 75(6):539–542, 2004.
[24] M. Hoogendoorn, J. Treur, C. N. van der Wal, and A. van Wissen. Modelling the interplay of emotions, beliefs and intentions within collective decision making based on insights from social neuroscience. In International Conference on Neural Information Processing, pages 196–206. Springer, 2010.
[25] V. T. Nguyen, D. Longin, T. V. Ho, and B. Gaudou. Integration of emotion in evacuation simulation. In C. Hanachi, F. Bénaben, and F. Charoy, editors, Information Systems for Crisis Response and Management in Mediterranean Countries, volume 196 of Lecture Notes in Business Information Processing, pages 192–205. Springer, 2014.
[26] N. R. Johnson. Panic and the breakdown of social order: Popular myth, social theory, empirical evidence. University of Cincinnati, pages 171–183, 1987.
[27] A. Ortony, G. Clore, and A. Collins. The Cognitive Structure of Emotions. Cambridge University Press, Cambridge, MA, 1988.
[28] E. Quarantelli. The sociology of panic. In Smelser and Baltes, editors, International Encyclopedia of the Social and Behavioural Sciences, pages 11020–11023. Pergamon Press, New York, 2001.
[29] E. Quarantelli. The nature and condition of panic. American Journal of Sociology, pages 267–275, 2010.
[30] T. Schelling. Micromotives and Macrobehavior. Norton, 1978.
[31] K. R. Scherer, A. Schorr, and T. Johnstone, editors. Appraisal Processes in Emotion: Theory, Methods, Research. Oxford University Press, 2001.
[32] X. H. Ta, D. Longin, B. Gaudou, and T. V. Ho. Impact of group on the evacuation process: theory and simulation. In Proceedings of the Sixth International Symposium on Information and Communication Technology, pages 350–357. ACM, 2015.
[33] P. Taillandier, A. Grignard, B. Gaudou, and A. Drogoul. Des données géographiques à la simulation à base d'agents : application de la plate-forme GAMA. European Journal of Geography, 671:online, 2014.

Informatica 41 (2017) 183–192

Key-Value-Links: A New Data Model for Developing Efficient RDMA-Based In-Memory Stores

Hai Duc Nguyen, The De Vu, Duc Hieu Nguyen, Minh Duc Le, Tien Hai Ho and Tran Vu Pham
Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Vietnam
E-mail: ptvu@hcmut.edu.vn

Keywords: in-memory stores, key-value, RDMA, InfiniBand

Received: March 24, 2017

This paper proposes a new data model, named Key-Value-Links (KVL), to help in-memory stores utilize RDMA efficiently. The KVL data model is essentially a key-value model with several extensions. It organizes data as a network of items in which items are connected to each other through links. Each link is a pointer to the address of the linked item and is embedded into the item establishing the link.
Organizing datasets using the KVL model enables applications to use RDMA Reads to fetch items directly from the server at very high speed. Since link chasing bypasses the CPU on the server side, this operation allows the client to read items at extremely low latency and greatly reduces the workload at data nodes. Furthermore, our model fits many real-life applications well, ranging from graph exploration and map matching to dynamic web page creation. We also developed an in-memory store utilizing the KVL model, named KELI. The results of experiments on real-life workloads indicate that KELI, without much optimization, easily outperforms Memcached, a popular in-memory key-value store, in many cases.

Povzetek: Predlagan je nov podatkovni model, imenovan Key-Value-Links (povezave ključnih vrednosti).

1 Introduction

In-memory stores have flourished in recent years owing to the urgent need for fast processing and decreasing DRAM prices. Many system designers have either used main memory as a primary data store [17] or as a cache to reduce the latency of accessing hot or latency-sensitive items [2]. Moving data to main memory enables it to be accessed at very low latency because the overhead of disk and flash is removed. This does not mean, however, that I/O overhead is completely eliminated. Because of DRAM's low capacity, in-memory stores are often deployed across multiple data nodes, making network I/O a potential source of overhead. Indeed, traditional TCP/IP networks have shown many disadvantages in supporting fast data transmission. For example, MemC3, a state-of-the-art in-memory store, runs seven times faster on a single machine than in a client-server setup using TCP/IP [9, 8].

To solve this problem, several data centers have started looking for alternative solutions. Among these, Remote Direct Memory Access (RDMA) appears to be the most promising candidate. RDMA allows applications to directly read from and write to remote memory without involving the operating system at either host. This ability helps RDMA achieve low-latency and high-throughput data transmission because it bypasses the overhead of complex protocol stacks, avoids buffer copying, and reduces CPU overhead. Despite these attractive features, RDMA has not been widely used in data centers due to the high prices of its supporting NICs. In recent years, however, the prices of RDMA-enabled NICs have dropped dramatically and become comparable with those of traditional Ethernet NICs. For example, a 40 Gbps InfiniBand RDMA-capable NIC costs around $500, while the price of a 10 Gbps Ethernet NIC may be up to $800 [15]. New standards such as iWARP and RoCE also support RDMA, allowing data centers to utilize RDMA at reasonable cost.

Within this trend, many studies have started to leverage RDMA technology to build ultra-low latency in-memory stores. Those works indicate that much effort has to be spent in order to maximize the benefits of using this technology for in-memory systems. The work to be done includes reducing the NIC's cache miss rate [8], minimizing the number of RDMA operations per request [15, 8], and optimizing the hash table organization [8, 15, 12]. In spite of implementation differences, most existing RDMA-based in-memory stores are constructed according to the key-value model, since this model is very simple and fits large, unstructured datasets well. The key-value model, however, has its own drawbacks. The most noticeable one is performance.
Traditionally, every put and get operation involves a hash table lookup to determine whether the item exists. This makes the hash table the hotspot of data access, and it is not surprising that most key-value stores spend much effort tuning their hashing mechanisms [15, 8, 9]. Real-life workloads indicate that key-value items are typically small [3], so employing the key-value model can easily lead to low network utilization. Furthermore, the key-value model often forces applications to divide each request into multiple small item lookups. Those lookups often have to be executed sequentially due to data dependencies. As a result, the hash table lookup overhead and the low bandwidth utilization of the individual lookups accumulate and considerably prolong the latency of the original request. According to [17], Facebook creates about 130 internal requests on average to generate the HTML for a page. Similarly, Amazon requires about 100–200 requests to create the HTML part of each page [7]. With such workloads, in-memory stores have to react very quickly to each request to guarantee the desired performance. In the future, as the amount of data and the workload keep increasing rapidly, it will be difficult for in-memory stores to maintain their performance without changing their request processing mechanisms.

In this paper, we introduce a novel data model named Key-Value-Links (KVL) to enable in-memory stores to exploit RDMA efficiently and deliver ultra-low latency data services. Essentially, the KVL model is a variant of the key-value model that maintains links between items, exploiting the data dependencies between them to accelerate data retrieval. A link contains information about the (physical) location and the size of the referred item, so applications can use RDMA Reads to directly fetch the item without invoking expensive item lookups. This design bypasses the hash table and uses the network efficiently, as only one RDMA Read is needed to read an item. As a result, getting desired items by chasing their links significantly reduces the cumulative latency of processing multiple item requests. We also introduce KELI (KEy-value-with-Links In-memory), an in-memory cache employing the KVL data model, and compare it with an in-memory key-value cache (Memcached) to reveal the performance benefits of utilizing the KVL model over RDMA-capable networks.

The following section briefly introduces RDMA technologies and recent work on developing in-memory stores using RDMA. Section 3 discusses the KVL model in detail. Several classes of applications which can utilize the model efficiently are listed in Section 4. Section 5 discusses the design of KELI. We conduct several experiments on real-life data to evaluate the efficiency of KELI and report their results in Section 6. Finally, we conclude the paper in Section 7.

2 Background and related work

2.1 Remote direct memory access

Remote Direct Memory Access (RDMA) allows a remote computer to directly read memory regions in a host's local memory without involving the host's CPU. This enables zero-copy data transfers and saves computing resources. Furthermore, RDMA-enabled NICs provide kernel bypass for all communications and reliable delivery to applications. As a result, the typical latency of interconnects supporting RDMA, such as InfiniBand, RoCE and iWARP, is about 10× lower than that of traditional Ethernet [12].
RDMA-enabled NICs were originally designed for high-performance computing centers, but due to decreasing hardware prices their presence in data centers is increasing [15]. The introduction of RoCE and iWARP, which let RDMA be performed over traditional network architectures, has made RDMA even more popular.

Applications use RDMA-enabled NICs through the Verbs API. There are several types of verbs, but the most common are RDMA Read, RDMA Write, Send and Receive. These verbs can be grouped into two types of semantics: channel semantics and memory semantics. Send and Receive have channel semantics: to send a message, the sender posts a Send descriptor that puts the message content into a remote memory location specified by a pre-posted Receive descriptor at the receiver side. Send and Receive are two-sided verbs, as the communication involves the CPUs of both endpoints. RDMA Read and RDMA Write have memory semantics: they operate directly upon remote memory regions. Both of them are one-sided, as the remote CPU is not aware of these operations. This reduces not only the overhead of RDMA operations but also the load on the remote CPU. Therefore, one-sided RDMA verbs can achieve very low latency and high throughput.

2.2 In-memory stores using RDMA

The attractive features of RDMA verbs have motivated many studies on utilizing RDMA technology to build high-performance in-memory stores. As communication is the major source of overhead, previous work tries to replace traditional data transfer techniques with RDMA operations to effectively reduce overall latency. For example, Jose et al. [11] improve Memcached performance by a factor of four just by making it RDMA capable. In later work [10], the same research group uses a hybrid approach that utilizes both Reliable Connection (RC) and Unreliable Connection (UC) transports and transparently switches between them to further improve performance, by a factor of 12.

Apart from communication, recent studies have started to apply multiple optimization techniques to other parts of the system to reach even better performance. HERD [12, 13] makes heavy changes, ranging from reducing network round trips and reorganizing data distribution to optimizing PCIe transactions. It even sacrifices reliability to maximize performance. RDMA has also been combined with other technologies to develop more complex in-memory stores: DrTM [23] and DrTM+R [6] are two fast in-memory transaction systems utilizing both RDMA and hardware transactional memory (HTM).

Different from traditional designs, Pilaf [15], FaRM [8] and HydraDB [22] let applications process requests by themselves through RDMA Reads. In these systems, the server makes data visible to clients so that a client can use RDMA Reads to access the hash table and items at the remote server as if they were in its own local memory. This approach bypasses many sources of overhead and reduces the load at data servers, but there are shortcomings preventing those systems from maximizing the potential of RDMA Reads. For example, Pilaf clients have to carry out multiple RDMA Reads per request. FaRM often performs RDMA Reads on memory blocks which are much larger than the actual size of the needed item. Also, bypassing the remote CPU makes the server unaware of application behaviors, which are useful for tracking popular items. Despite differences in implementation, all studies mentioned above employ the key-value model.
Although this model is quite simple and easy to implement, its inability to represent complex data forces applications to generate many item lookups for each data request. This disadvantage makes the key-value model sensitive to latency, and it is our motivation for developing the Key-Value-Links model.

3 Key-Value-Link data model

3.1 Example

Before describing the Key-Value-Links (KVL) model in detail, let us first show how it "looks and feels" through an example of using it to represent a real-life dataset and handle data requests. Suppose we have a database storing information about students, professors, and departments in a university. Figure 1 illustrates how KVL is used to organize this database. In this representation, each entity (i.e. student, professor, and department) is a key-value item. The key is unique and is used to identify the item. The database has five entities: two students "stu001" and "stu002" under the supervision of two professors "prof001" and "prof002" working in the department "depcs". The value of each of those items contains multiple attributes representing information associated with the item. The item "stu002", for instance, has three attributes in its value. The first one ("name: DEF") gives the student's name, while the others are links indicating his supervisor and mentor. These links do not provide information about those people but instead point to the items storing information about them. In an implementation, links can be represented as pointers, which let applications directly access the linked item without sending a request to the data server.

Figure 1: An example of representing a dataset from a university using the KVL data model. The key of each item is shown in bold text, the links are shown in shadowed frames, and the arrows represent the links from one item to another.

Storing links to other items inside an item's value makes it easier to derive useful information that requires combining data from multiple sources. In the university database, for example, suppose a user wants to know whether the two students "stu001" and "stu002" are supervised by professors working in the same department. To answer this question, we must first access the two student items using their keys. After that, we traverse the supervisor links of those items to obtain information about the supervisors. We then use the department links to go to their departments. Finally, we check whether those departments are the same to provide the final answer to the question.

If this database were represented in the relational model, answering this question would require multiple cumbersome joins. Because such operations consume a lot of time and resources, using a relational database in this case does not guarantee acceptable performance. If we use the traditional key-value model, the application must decompose the request into multiple item lookups. Since most of the lookups depend on the results of the previous ones, the application has to perform them sequentially. If the number of lookups is large, the cumulative latency, obtained by adding up the latency of each item lookup, becomes very high and considerably hurts the overall latency of the original request.
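The sketch below restates the university example in Python. A plain dict stands in for the server's memory region, integer constants stand in for physical addresses, and chase() stands in for an RDMA Read at a known address; all names are illustrative, not part of any real API. In the actual model, the two student items would first be fetched by key (one lookup each), and every step after that is a link chase.

```python
memory = {}                        # "address" -> item, simulating server DRAM

def store(addr, item):
    memory[addr] = item
    return addr                    # in KVL, a link records the item's address

def chase(link):
    # Stand-in for an RDMA Read: fetch the item at a known address
    # without any hash-table lookup on the server.
    return memory[link]

dep = store(0x100, {"name": "XYZ"})                                  # depcs
p1  = store(0x200, {"name": "UVT", "department": dep})               # prof001
p2  = store(0x300, {"name": "IJK", "department": dep})               # prof002
s1  = store(0x400, {"name": "ABY", "supervisor": p1})                # stu001
s2  = store(0x500, {"name": "DEF", "supervisor": p2, "mentor": p1})  # stu002

def same_department(stu_a, stu_b):
    """Answer the query from the text purely by chasing links."""
    dep_a = chase(chase(stu_a)["supervisor"])["department"]
    dep_b = chase(chase(stu_b)["supervisor"])["department"]
    return dep_a == dep_b

print(same_department(s1, s2))     # True: both professors are in depcs
```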
Most of the item lookups can be replaced by a single link chase if we use the KVL model. As we will show in the next subsection, an item lookup is much more expensive than a link chase, so using the key-value model also takes more time to process the request than applying the KVL model.

3.2 Data model

Generally, KVL is an enhanced version of the traditional key-value model. In this model, each item is a key-value pair, and items are connected to each other through links. Inside an item, the key is its identifier while the value describes its characteristics. There is no restriction on the size of either key or value. Unlike some implementations of the key-value model used in RAMCloud [17] or Memcached [2], the KVL model cares about the structure of the value. Particularly, the value is a set of attributes in ⟨K, V⟩ format, where K is the name of the attribute and V is its value. The value V can be either a block of bytes representing some kind of information defined by the user or a link to another item.

The concept of links is similar to that of pointers: both let applications know the location of a resource but do not provide information about its content and associated data. There are several benefits to this approach. Embedding an item into another one enlarges the data size significantly, which reduces memory utilization and slows down data transmission; it also allows an item to have multiple copies, which can be a nightmare for maintaining consistency. Furthermore, with the support of RDMA Reads, pointing to the referred item through its address lets applications directly fetch the item without involving the data server. Utilizing RDMA Reads helps fetch items at ultra-low latency, as it bypasses many sources of overhead such as notifying the remote CPU and the hash table lookup. It also allows the system to scale easily, as the remote machine saves many CPU cycles for other tasks.

Figure 2: An example implementation of the KVL model based on the organization of existing key-value stores.

Figure 2 illustrates the organization of an in-memory store implementing the KVL model based on the fundamental structure of existing in-memory key-value stores. Basically, the KVL model is still a key-value model, so the methods it uses to handle data are similar to those of key-value stores. In particular, the store constructs a hash table to keep track of the items it holds. Putting a new item into the store, or getting an existing one from it, first requires the item's key to be hashed into the hash table to determine the proper action. Clearly, all operations in a key-value store involve the hash table, making it the hotspot of the system.

The introduction of links leads to a new way to get data from in-memory stores, called link chasing, which reduces the load on the hash table. In this method, applications use the links attached to previously fetched items to invoke RDMA Reads that directly retrieve the linked items from the in-memory store, without explicitly sending a get request. For example, in Figure 2, the application has performed two lookups to load items B and C from the server. It then has two options to load item A. It can generate a get request containing the key of item A and send it to the server to have the server search for this item.
The other option is to use an RDMA Read to chase the link to item A that is embedded in item C, reading the item directly without asking the server.

Figure 3: The latency of getting items of different sizes using RDMA Read and HERD.

Figure 3 compares the latency of link chasing using RDMA Read with that of an item lookup for different item sizes. The implementation of the item lookup is based on the method used in HERD [12], one of the fastest in-memory key-value stores in the literature. It is clear that, even though it is heavily optimized by many techniques, an item lookup still runs much slower than an RDMA Read. This means that if we can organize the items needed by applications so that from one item we can reach the others just by chasing links, the latency can be reduced by up to 50%. Therefore, using the KVL data model with a good data schema design can significantly boost system performance without spending much effort on optimizing the in-memory store implementation.

4 Applications

Apart from the simple university example in the previous section, we found that KVL is applicable to a wide range of applications. The following are a few of them.

4.1 Graph exploration

Graph exploration is required by many data-intensive applications [16]. Graph traversal algorithms such as breadth-first search (BFS) and depth-first search (DFS) are used as basic components in various sophisticated algorithms that solve problems in many fields, including biology, communication, social networks, etc. In the Big Data era, a graph can contain up to trillions of nodes and edges; hence, traversing such large-scale graphs efficiently is critical.

A graph contains only nodes and edges, but in real-world applications both nodes and edges are associated with a lot of information. This makes representing the topology of a graph a nontrivial task, especially in the case of large graphs, which can span multiple data nodes. Modeling graphs using the relational model or XML does not scale well, and those tools do not ease graph traversals. Many state-of-the-art in-memory graph databases are constructed upon the key-value model [4, 21] due to its simplicity. However, deploying graph traversal algorithms on this model leads to high cumulative latency.

The KVL model, on the other hand, is very similar to the concept of a graph database, since the model itself is a network of items. In fact, it can be considered a "lightweight" graph in which items are vertices and links are edges. We use the term "lightweight" because some limitations prevent KVL from naturally representing complicated graphs. For example, information cannot be embedded into links, and a link must point to a physical address rather than an abstract object. Due to those shortcomings, using links to represent the edges of complex graphs can increase management costs considerably. In spite of this, KVL appears well suited to graph traversal algorithms: with link chasing, applications avoid a lot of overhead while visiting vertices, as the sketch below illustrates.
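Here is a minimal sketch of a BFS over a KVL-style adjacency structure like the Twitter layout used later in Section 6, where follower lists are broken into chunks connected by a "next" link. The memory/chase stand-ins for RDMA Reads follow the conventions of the earlier university sketch; all identifiers and addresses are illustrative.

```python
from collections import deque

memory = {
    0x10: {"name": "userABC", "follow_list": 0x11},
    0x11: {"follows": [0x20], "next": 0x12},      # first chunk of the edge list
    0x12: {"follows": [0x30], "next": None},      # second (last) chunk
    0x20: {"name": "userEDF", "follow_list": None},
    0x30: {"name": "userGHI", "follow_list": None},
}

chase = memory.__getitem__        # stand-in for an RDMA Read at an address

def bfs(start_addr):
    """Visit every reachable user; only the start needs a key lookup,
    every subsequent access is a link chase."""
    seen, todo = {start_addr}, deque([start_addr])
    while todo:
        user = chase(todo.popleft())
        chunk_addr = user["follow_list"]
        while chunk_addr is not None:             # walk the chunked edge list
            chunk = chase(chunk_addr)
            for friend in chunk["follows"]:
                if friend not in seen:
                    seen.add(friend)
                    todo.append(friend)
            chunk_addr = chunk["next"]
    return seen

print(len(bfs(0x10)))   # 3 users reachable from userABC
```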
4.2 Dynamic web content creation

The rapid growth of the amount of data and the need to improve user experience make the number of dynamic web pages increase at a high pace. One well-known solution for efficiently delivering dynamic content is to decompose pages into small fragments and cache those in main memory. Additionally, an object dependence graph (ODG) is constructed to keep track of changes and maintain consistency [5, 19]. When a web page is requested, the ODG is checked in order to rebuild fragments whose content has changed and to fetch those whose content is unchanged directly from the cache. During this process, fragments are fetched sequentially, since later fragments depend on earlier ones. With such an access pattern, using the key-value model implemented in popular in-memory caches to store the fragments and the ODG can lead to high cumulative overhead when creating a page.

The KVL model is well suited to caching such datasets, since they are also graphs in nature. By representing fragments as items and using links to express the dependencies between fragments, applications can construct a web page by simply chasing links between page components. Since link chasing is much faster than item lookup, applying this model reduces the cumulative latency significantly.

4.3 Intelligent transportation systems

Intelligent Transportation Systems (ITS) play an important role in addressing critical issues in urban areas such as congestion, air pollution, and transit safety. The major challenge for ITS systems is that they have to manage a huge amount of data pushed to the system continuously from many sources (GPS, video streams, etc.) in order to produce meaningful information in real time. To do so, the digital map must be well organized, since most critical operations, such as map matching, routing, and congestion detection, rely on it. As the map can be considered a network of points (e.g. intersections) and lines (e.g. streets), the KVL model is a promising candidate for representing its content in ITS systems.

5 KELI: A KVL In-memory Store

We have implemented an in-memory store utilizing the KVL model, named KELI (KEy-value-with-Links In-memory store). We originally developed KELI while constructing a traffic condition monitoring system for Ho Chi Minh City (available at traffic.hcmut.edu.vn). The main role of KELI is to manage the metadata of the city map so that applications can quickly process GPS signals generated by vehicles to produce meaningful information about the current traffic conditions in the city [14]. Although KELI was originally designed for an ITS system, its architecture is general enough to work with other applications, so we extended its implementation to make it applicable to a wide range of use cases.

5.1 System architecture

Our objective when designing KELI is to provide a lightweight in-memory store for rarely-changing datasets stored in complex (disk-based) databases. Particularly, KELI copies items stored in the database to memory and lets applications access them through its interface instead of sending requests directly to the database. The design of KELI also assumes that updates occur very rarely and that a few changes do not seriously impact application performance and correctness.

Figure 2 illustrates the overall architecture of KELI. Data is stored permanently on disk to ensure durability and availability, while KELI is deployed entirely in memory. After starting up, KELI accesses the data on disk and loads it into memory. During this process, items are transformed from their original on-disk format into the KVL format.
After KELI finishes loading data from disk, data accesses can be redirected to KELI, and the database then acts as a backup module. KELI does not support update operations (i.e. modify, write, and delete), so if an application wants to change the content of the data, it still has to send those requests to the database. Updates occurring on disk do not take effect immediately in the in-memory store; KELI instead reloads the content of the data on disk at predefined, fixed intervals.

5.2 Data layout

Since DRAM capacity is much smaller than that of secondary storage, utilizing memory space efficiently is a crucial requirement. To do so, avoiding or reducing fragmentation is necessary, as fragmentation is the primary source of low memory utilization. S. M. Rumble et al. [20] showed that current standard dynamic memory allocators, such as malloc in C, do not handle this problem well.

Figure 4: Request processing in KELI.

Therefore, to avoid fragmentation, we do not use dynamic memory allocators to create room for data. Free space is instead reserved in advance in the form of contiguous memory slots, which KELI fills up with the content of new items.

KELI updates its content periodically in batch style. Every time the update process is triggered, KELI first allocates new memory regions for the new items, then fills them up with the content of the data stored in the disk-based database. After that, it deallocates the memory regions of the old data and uses the items in the new memory regions to answer upcoming requests from applications. As clients bypass the server when chasing links, KELI must ensure that applications do not access old items after an update has taken place; it does so by halting all active connections from the clients and having the clients reestablish those connections to the server to obtain the new content.

Similar to key-value stores, KELI employs a hash table to track items by their key. We use cuckoo hashing [18] to implement the hash table, since this technique ensures constant lookup complexity in the worst case and guarantees stable performance on large datasets. The hash table does not hold the content of the hashed items; it instead stores pointers to the actual data. So for each new item, KELI first finds a slot for it in the allocated memory regions and writes its content to this slot. After that, the item's key and the pointer to the slot are added to the hash table.

Figure 5: Modeling Twitter datasets with the KVL model. Black boxes represent a list of links, each link referring to one item in the dataset. Gray boxes represent a single link.
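The put path described above (pre-allocated slots plus a pointer-only hash table) can be summarized in a few lines of Python. This is a hypothetical sketch, not KELI code: a bytearray stands in for a pre-allocated contiguous region and a plain dict stands in for the cuckoo hash table.

```python
class Region:
    """A pre-allocated contiguous memory region; items are appended,
    so there is no dynamic allocation and no fragmentation."""
    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.used = 0

    def append(self, data: bytes):
        off = self.used
        self.buf[off:off + len(data)] = data
        self.used += len(data)
        return off, len(data)          # exactly what a link/pointer records

region = Region(1 << 20)
index = {}                             # stand-in for the cuckoo hash table

def put(key: str, value: bytes):
    # The hash table stores only the pointer, never the item content.
    index[key] = region.append(value)

def get(key: str) -> bytes:
    off, size = index[key]
    return bytes(region.buf[off:off + size])

put("stu001", b"name=ABY;supervisor=prof001")
print(get("stu001"))
```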
Some stores, such as HBase [1], keep items in memory in the form of memory objects to simplify data management. This approach, however, often requires the server and client to serialize/deserialize those objects to/from arrays of bytes before transferring them over the network. Serialization adds significant overhead to request processing, especially for small requests like item lookups. Furthermore, if items contained pointers to different resources, chasing links would generate multiple RDMA Reads, making the operation inefficient. Therefore, KELI servers store items in the form of byte arrays and let the client perform the serialization.

5.3 Request processing

(a) Fragments represented by the KVL model. (b) Web page content.

Figure 6: Modeling a dynamic web page with the KVL model. Black boxes represent a list of links, each link referring to one item in the dataset. Gray boxes represent a single link.

In this subsection, we show how KELI handles requests from clients. The whole process is shown in Figure 4. KELI's communication modules are built upon the IB verbs programming model. Given a key, the client asks for its value by issuing a "get" request via "ib_send". The request is received at the server side by a listener, which is responsible for receiving all incoming requests. In order to maximize KELI's performance, the listener continuously polls the input queue for new requests instead of passively waiting for the queue to notify it of a new message, as traditional techniques do. Although this approach wastes many CPU cycles polling the input queue, it makes KELI respond to new requests very quickly.

When the listener discovers a new request in the input queue, it pops the request and forwards it to a worker thread in the thread pool. Threads are chosen randomly to ensure load balancing. After receiving a request from the listener, the chosen thread searches for the needed item in the hash table. If the item is not found, it generates a response with an empty payload and sends it back to the client using the "ib_recv" operation. Otherwise, the hash table returns a pointer to the location of the item. The thread simply follows the pointer, generates a non-empty response message, copies the content of the item into the payload of this message and sends the response back to the client (also using "ib_recv").
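The listener/worker pipeline can be sketched as below. Python queues stand in for the verb queues, a blocking get() replaces the busy-polling loop for brevity, and the queue names, item layout, and 4-thread pool size are all illustrative assumptions, not KELI's actual code.

```python
import queue
import random
import threading

MEMORY = bytearray(b"hello-world-items...")   # pre-loaded item region
INDEX = {"C": (0, 5), "A": (6, 5)}            # key -> (offset, size) pointer

rx = queue.Queue()                            # stand-in for the receive queue
workers = [queue.Queue() for _ in range(4)]

def listener():
    # Real KELI busy-polls the input queue; blocking keeps the sketch short.
    while True:
        req = rx.get()
        random.choice(workers).put(req)       # random choice = load balancing

def worker(inbox):
    while True:
        key, reply = inbox.get()
        ptr = INDEX.get(key)                  # hash-table lookup
        if ptr is None:
            reply.put(b"")                    # empty payload when not found
        else:
            off, size = ptr
            reply.put(bytes(MEMORY[off:off + size]))

threading.Thread(target=listener, daemon=True).start()
for w in workers:
    threading.Thread(target=worker, args=(w,), daemon=True).start()

resp = queue.Queue()
rx.put(("C", resp))                           # a client "get" request
print(resp.get())                             # b'hello'
```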
The client is responsible for interpreting the payload of the response message. If the item contains links to other items and the application wants to retrieve them, the client does not make another "get" request, but instead uses RDMA Reads to directly read the content of the linked items from the server. Doing so significantly reduces item-loading latency, since executing an RDMA Read is much cheaper than explicitly invoking an item lookup request (e.g. "get").

6 Experiments

6.1 Experiment setup

In this section, we illustrate the benefits of employing the KVL model for RDMA-based in-memory stores by comparing the performance of KELI with that of another in-memory key-value store. We chose Memcached for this task due to its popularity. In fact, to make the comparison fair, instead of using the original version we use an extended version of Memcached, which uses RDMA verbs for data transmission, for all experiments [11, 10].

The two stores are compared on practical applications. Particularly, we use the KVL model to represent several real-life datasets and let KELI manage them. We do the same with Memcached, except that the links in items are replaced by the keys of the referred items. We then develop applications implementing popular algorithms working over those datasets. The data such applications need for computation is fetched from either KELI or Memcached; we measure the computation cost and use it to compare the two stores.

6.2 Data modeling

We conduct experiments on three different real-life datasets, each associated with one of the problems listed in Section 4. In the text below, we describe those datasets and how the KVL model is used to model them. For the key-value version, we simply replace each link with the key of the item it points to.

Social Network Graph traversal is very common in social networks. For example, given a user, finding a person with a given name (e.g. "John") among his friends, his friends' friends, and so on is a typical problem making use of graph exploration. In this experiment, we perform Breadth-First Search (BFS) over a real-life social network dataset provided by Twitter. The dataset contains about one million nodes representing users and more than 22 million edges representing followership between users. Figure 5 shows how the dataset is modeled with the KVL model. This representation is similar to an adjacency list data structure, except that the edge (e.g. follower) lists are broken into multiple chunks, since one user may have a lot of friends; integrating all of them into one item could enlarge this item considerably, leading to performance degradation. In the following experiments, we let each list chunk contain at most 100 followers.

Web Page Generation We construct a web page displaying information about reviews of products sold by Amazon, using the dataset provided by Amazon itself. The content of the page is dynamic, as product information changes frequently and users continuously update their reviews of products. We have to break the HTML file into multiple parts and change their content right after an update takes effect. Figure 6a shows the relationship between users, products, and users' reviews of some products, and Figure 6b shows the look and feel of the page's HTML.

Map Matching We choose the map matching problem as a representative application for ITS systems. Given a GPS signal, we have to determine whether this signal belongs to any street and, if so, identify which place on the street it falls into.
This problem is very common in ITS systems, being involved in real-time traffic monitoring, congestion detection, routing, etc. In this experiment, we use a digital map provided by OpenStreetMap (OSM) to construct the dataset about streets in Ho Chi Minh City. Figure 7 illustrates an example of modeling the map with the KVL model. Particularly, according to OSM's format, a street is a polyline constructed by connecting multiple nodes (points). Since a street polyline is typically long, we do not match GPS signals against streets but against the lines constructed by connecting two consecutive nodes on a street, called segments. An item representing a segment links to the items containing information about its endpoints (nodes). There are also links from streets to the segments constructed from their nodes. We group segments into disjoint areas called cells, based on their geographical location.

(a) Objects on the map. (b) Objects in the KVL model.

Figure 7: Modeling objects on a digital map with the KVL model. Black boxes represent a list of links, each link referring to one item in the dataset. Gray boxes represent a single link.

Figure 8: The latency of link chasing and item lookup in our experiments.

The map matching algorithm is quite simple: given a GPS signal, the application first determines its spatial information (e.g. latitude and longitude) and uses it to identify the corresponding cell. It then queries the in-memory store for this cell and retrieves the segments belonging to it, to find out which segment the signal belongs to based on their geographical locations. A sketch of this procedure follows.
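Below is a hypothetical sketch of this cell-based lookup in Python. The cell is fetched by key (in practice its identifier would be computed from the coordinates); the segments and their endpoints are then reached by link chasing, with a dict standing in for server memory as in the earlier sketches. The point-to-segment distance test is simplified to a distance to the segment midpoint.

```python
import math

memory = {
    0x1: {"lat": 10.771, "lon": 106.698},            # nodeIJK
    0x2: {"lat": 10.772, "lon": 106.700},            # nodeUVT
    0x3: {"lat": 10.780, "lon": 106.710},            # nodePQO
    0x10: {"endpoint0": 0x1, "endpoint1": 0x2},      # segXYZ
    0x11: {"endpoint0": 0x2, "endpoint1": 0x3},      # segLMN
}
cells = {"cellABC": [0x10, 0x11]}                    # cell key -> segment links

chase = memory.__getitem__                           # RDMA Read stand-in

def dist_to_segment(lat, lon, seg):
    """Simplified distance to the segment's midpoint (a real matcher
    would project the point onto the segment line)."""
    a, b = chase(seg["endpoint0"]), chase(seg["endpoint1"])
    mid_lat = (a["lat"] + b["lat"]) / 2
    mid_lon = (a["lon"] + b["lon"]) / 2
    return math.hypot(lat - mid_lat, lon - mid_lon)

def match(lat, lon, cell_key):
    segments = cells[cell_key]                       # one key lookup per signal
    return min(segments,
               key=lambda s: dist_to_segment(lat, lon, chase(s)))

print(hex(match(10.7715, 106.699, "cellABC")))       # -> 0x10 (segXYZ)
```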
6.3 Performance evaluation

We conduct all experiments on two computers equipped with Intel Xeon E5-2670 processors and 32 GB of main memory. They are connected through an InfiniBand link using Mellanox ConnectX-3 40 Gbps NICs. The RDMA-Memcached used in all experiments is based on Memcached version 1.4.24, and the applications use libMemcached version 1.0.18 to communicate with the store.

In order to fully understand the effect of using the KVL model, let us first compare the read performance of KELI's item lookup and link chasing with RDMA-Memcached's get operation. Figure 8 shows the experimental results. Clearly, the naive implementation of the lookup using Send/Recv verbs performs very poorly: it is about three to four times slower than the optimized version used by RDMA-Memcached. However, even the optimized item lookup still takes about twice as long as link chasing. Therefore, if applications make good use of links, KELI can perform better than RDMA-Memcached.

Although KELI has to deserialize item content and check for consistency when chasing links, link-chasing latency is only slightly higher than that of the pure RDMA Read reported in Figure 3. This is because the time spent on communication is the dominant cost of RDMA operations. So although KELI has to check consistency and deserialize every item it reads, its latency is still lower than that of HERD. Also note that HERD's lookup latency could be higher in practice, as it sacrifices reliability and lets applications take care of integrity checks in order to boost lookup performance as much as possible.

Figure 9: Map matching latency.

Figure 10: Web page construction latency.

Figure 11: Graph exploration.

In the map matching experiment, we preload both KELI and RDMA-Memcached with about six million key-value pairs representing the geographical information of Ho Chi Minh City. Similarly, we prepare about 200 thousand reviews of more than 12 thousand products for the web creation application, and a graph with one million nodes and about 22 million edges for the BFS traversal. Figures 9, 10, and 11 show the execution times of the map matching, web page construction, and BFS algorithms, respectively, using KELI and RDMA-Memcached. KELI outperforms RDMA-Memcached in all cases.

In the case of map matching, KELI outperforms RDMA-Memcached by a factor of two on average. For tail latency (95th percentile), KELI still runs about 2.5 times faster than RDMA-Memcached. KELI also helps applications construct web pages 50% faster than RDMA-Memcached does. Similarly, the implementation of the BFS algorithm using KELI runs 75% faster than the one using RDMA-Memcached.

The reason is that, with the data layouts described in the previous section, applications utilizing KELI mostly use link chasing to fetch new items. For example, in the case of graph traversal, the application only has to invoke an item lookup once, to retrieve the first vertex. After that, based on the "list" and "next" links integrated into each accessed vertex and edge list, the application can always invoke link chasing to get information about the next vertex to be visited. On the other hand, applications supported by RDMA-Memcached have no choice but item lookups to retrieve data. Since an item lookup is about two times slower than link chasing, KELI performs about two times better than RDMA-Memcached.

7 Conclusion

In this paper, we presented KVL, an enhanced version of the key-value model for in-memory stores working over RDMA-capable networks. In this model, each dataset is a network of key-value pairs linked to each other. Each link is a pointer to the address of the referred item and is integrated directly into the item. With this organization, the KVL model introduces a new operation, named link chasing, that allows applications to utilize RDMA Reads to directly read items through links without involving the data server. Our experiments have shown that this model fits many real-life applications well. Also, by utilizing this model, KELI, an ordinary in-memory store without much optimization, easily outperforms a state-of-the-art in-memory store.

References

[1] HBase. https://hbase.apache.org/. Accessed: 2017-03-03.
[2] Memcached. https://memcached.org/. Accessed: 2016-11-07.
[3] Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. Workload analysis of a large-scale key-value store. In ACM SIGMETRICS Performance Evaluation Review, volume 40, pages 53–64. ACM, 2012.
[4] Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, et al. TAO: Facebook's distributed data store for the social graph. In 2013 USENIX Annual Technical Conference (USENIX ATC 13), pages 49–60, 2013.
[5] Jim Challenger, Arun Iyengar, Karen Witting, Cameron Ferstat, and Paul Reed. A publishing system for efficiently creating dynamic web content. In INFOCOM 2000: Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies, volume 2, pages 844–853. IEEE, 2000.
[6] Yanzhe Chen, Xingda Wei, Jiaxin Shi, Rong Chen, and Haibo Chen. Fast and general distributed transactions using RDMA and HTM. In Proceedings of the Eleventh European Conference on Computer Systems, page 26. ACM, 2016.
[7] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon's highly available key-value store. ACM SIGOPS Operating Systems Review, 41(6):205–220, 2007.
[8] Aleksandar Dragojević, Dushyanth Narayanan, Miguel Castro, and Orion Hodson. FaRM: Fast remote memory. In 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pages 401–414, 2014.
[9] Bin Fan, David G. Andersen, and Michael Kaminsky. MemC3: Compact and concurrent Memcache with dumber caching and smarter hashing. In 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), pages 371–384, 2013.
[10] Jithin Jose, Hari Subramoni, Krishna Kandalla, Md Wasi-ur Rahman, Hao Wang, Sundeep Narravula, and Dhabaleswar K. Panda. Scalable Memcached design for InfiniBand clusters using hybrid transports. In Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on, pages 236–243. IEEE, 2012.
[11] Jithin Jose, Hari Subramoni, Miao Luo, Minjia Zhang, Jian Huang, Md Wasi-ur Rahman, Nusrat S. Islam, Xiangyong Ouyang, Hao Wang, Sayantan Sur, et al. Memcached design on high performance RDMA capable interconnects. In 2011 International Conference on Parallel Processing, pages 743–752. IEEE, 2011.
[12] Anuj Kalia, Michael Kaminsky, and David G. Andersen. Using RDMA efficiently for key-value services. In ACM SIGCOMM Computer Communication Review, volume 44, pages 295–306. ACM, 2014.
[13] Anuj Kalia, Michael Kaminsky, and David G. Andersen. Design guidelines for high performance RDMA systems. In 2016 USENIX Annual Technical Conference (USENIX ATC 16), 2016.
[14] Minh Duc Le, The De Vu, Duc Hieu Nguyen, Tien Hai Ho, Duc Hai Nguyen, Tran Vu Pham, et al. KELI: a key-value-with-links in-memory store for realtime applications. In Proceedings of the Seventh Symposium on Information and Communication Technology, pages 195–201. ACM, 2016.
[15] Christopher Mitchell, Yifeng Geng, and Jinyang Li. Using one-sided RDMA reads to build a fast, CPU-efficient key-value store. In 2013 USENIX Annual Technical Conference (USENIX ATC 13), pages 103–114, 2013.
[16] Richard C. Murphy, Kyle B. Wheeler, Brian W. Barrett, and James A. Ang. Introducing the Graph 500. Cray User's Group (CUG), 2010.
[17] John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazières, Subhasish Mitra, Aravind Narayanan, Guru Parulkar, Mendel Rosenblum, et al. The case for RAMClouds: Scalable high-performance storage entirely in DRAM. ACM SIGOPS Operating Systems Review, 43(4):92–105, 2010.
Defense Strategies against Byzantine Attacks in a Consensus-Based Network Intrusion Detection System

Michel Toulouse, Hai Le and Cao Vien Phung
Vietnamese German University, Vietnam
E-mail: michel.toulouse, hai.lh@vgu.edu.vn, caovienphung@gmail.com

Denis Hock
Frankfurt University of Applied Sciences, Germany
E-mail: dehock@fb2.fra-uas.de

Keywords: network security, intrusion detection, consensus algorithms, Byzantine attacks

Received: March 29, 2017

The purpose of a Network Intrusion Detection System (NIDS) is to monitor network traffic so as to detect malicious usage of network facilities. NIDSs can also be part of the affected network facilities and be the subject of attacks aiming at degrading their detection capabilities. The present paper investigates such vulnerabilities in a recent consensus-based NIDS proposal [1]. This system uses an average consensus algorithm to share information among the NIDS modules and to develop coordinated responses to network intrusions. It is known, however, that consensus algorithms are not resilient to compromised nodes sharing falsified information, i.e. they can be the target of Byzantine attacks. Our work proposes two different strategies aiming at identifying compromised NIDS modules that share falsified information. Also, a simple approach is proposed to isolate compromised modules, returning the NIDS to a non-compromised state. Validations of the defense strategies are provided through several simulations of Distributed Denial of Service attacks using the NSL-KDD data set. The efficiency of the proposed methods at identifying compromised NIDS nodes and maintaining the accuracy of the NIDS is compared. The computational cost of protecting the consensus-based NIDS against Byzantine attacks is evaluated. Finally, we analyze the behavior of the consensus-based NIDS once a compromised module has been isolated.

Povzetek: Network intrusion detection systems rely on detecting unusual traffic patterns, but they are themselves vulnerable to attacks. This paper describes defenses against Byzantine attacks.

1 Introduction

Network intrusion detection systems are part of a vast array of tools that protect computer infrastructures against malicious activities.
The specific task of NIDSs is to monitor computer network infrastructures, seeking to identify malicious intent through the analysis of network traffic. Today's computer networks are quite large, composed of several heterogeneous sub-networks. Consequently, traffic monitoring often needs to be done distributively, with sensors and traffic analysis modules placed at different strategic locations, each in charge of monitoring and analyzing the traffic of a specific sub-network.

Usually, NIDS monitoring modules are connected by a network, allowing security information collected about a sub-network to be shared with other NIDS modules. Which information is shared, and how it is shared, often characterizes the organization of NIDSs as centralized, hierarchical or distributed [2]. The monitoring modules of centralized and hierarchical NIDS architectures, which can be limited to simply collecting data, send their information up the hierarchy for further analysis. To the extent that analysis and responses depend on a single module or a few modules in the NIDS, these systems can be completely incapacitated by attacks that target the more intelligent modules. A common mitigation for these risks is to avoid a single point of failure by using distributed Intrusion Detection Systems [3, 4]. Modules in distributed intrusion detection systems are often full-scale sensing and analytical devices. The modules cooperate by sharing information to address attacks from concurrent sources (such as distributed denial of service), to develop network-wide coordinated responses to attacks, or simply to increase the detection accuracy of each NIDS module. Early distributed systems [5, 6] were also built upon a master-slave architecture and required the data to be sent to a central location for further analysis. Today, using peer-to-peer systems [7, 8, 9, 10], it is possible to recognize attacks by analyzing shared information in a fully distributed manner.

While it is more difficult to completely disable distributed systems compared to centralized ones, modules of a distributed system can still be the target of attacks aiming to disable the system locally or to mask attacks in some sub-networks from other nodes of the distributed NIDS. The present research addresses the vulnerability of a recently proposed fully distributed NIDS [1]. This system uses an average-consensus algorithm for computing network-wide security information that can then be used to recognize attacks and activate coordinated responses to malignant activities. However, it is well known that consensus algorithms are not resilient to compromised nodes sharing falsified information, i.e. they can be the target of Byzantine attacks.

Consensus algorithms are based on peer-to-peer communications among neighbor nodes of a computer network (no routing). They are distributed iterative algorithms in which each node of the network repeatedly updates its current value based on its own previous value and the previous values of its neighbors in the network. The objective is to reach a "consensus", i.e. each node computes the same output, which depends on initial values distributed across the network, while using only local updates. By repeating such local computations, and given overlapping neighborhoods, a consensus eventually emerges by diffusion of local updates. Consensus algorithms have a long history in computer science, where they provide solutions to distributed computing problems.
For example, consensus algorithms solve the leader election problem, where processes must select one of them to coordinate tasks in a distributed system [11]. Consensus algorithms have also found applications or research interest in physics [12], process control [13], robotics [14], operations research [15], and services at IoT edge nodes [16], not to mention their application in the controversial bitcoin currency [17].

Average consensus refers to a particular form of consensus where cooperative nodes compute the average of their initial values. Average consensus algorithms also have a wide range of applications; for example, we find them recently in wireless network applications such as cooperative spectrum sensing in cognitive radio networks [18], distributed detection in wireless networks [19], and sensor networks [20].

The vulnerability of consensus algorithms to the sharing of falsified information has been known for a long time. Originally, consensus algorithms solved the problem of reaching agreement assuming a non-faulty, non-adversarial computing environment. In reality, links can fail, nodes can stop transmitting data to neighbors (faulty links, nodes), or nodes can transmit incorrect data, possibly falsified by an adversarial actor (Byzantine nodes). The resilience of consensus algorithms has been analyzed in the context of fault-tolerant systems by Lamport, Pease and Shostak [21, 22]. The problem of reaching consensus in faulty and adversarial environments became known as the Byzantine agreement problem [22]. The problem asks under which conditions consensus can be reached in the presence of Byzantine faults. In [21], it is proved that resilient consensus algorithms cannot be designed in a fully connected network (complete graph) of $n$ processors if the number $m$ of Byzantine nodes satisfies $n \leq 3m$; equivalently, resilience requires $n \geq 3m + 1$. The Byzantine agreement problem in [22] refers specifically to attacks in which Byzantine nodes modify the initial values for which consensus is computed (data falsification attacks). Since then, the Byzantine agreement problem has been adapted to consider new failure conditions, i.e. different attack models, as well as quite diverse network settings.

Research studies aiming at detecting Byzantine nodes are of particular relevance to our work. In [23, 24], a technique based on the detection of outliers is applied to find compromised nodes in a consensus-based spectrum sensing algorithm for ad hoc wireless networks. In [24], several attack models are proposed to subvert the spectrum sensing algorithm. One attack model is a covert adaptive data injection attack, which adjusts attack strategies by manipulating the sensing results. The proposed defense consists of isolating neighbor nodes that send numerical data deviating too much from some norm. In [25], the detection of Byzantine nodes is derived from reputation-based trust management strategies. In that paper, one type of attack consists of malicious robots injecting false data to neighbors in a multirobot system controlled by a consensus algorithm for the purpose of formation control. The proposed defense system consists of decreasing the consensus weight contribution of a node whose reputation drops during the computation of consensus states. Defense strategies against Byzantine attacks also originate from research in process control and control theory, fields where one focus is to provide methodological approaches to detect faulty components in a system.
In [26, 27, 28], different approaches based on control theory are proposed to detect Byzantine attacks on consensus algorithms. In [26, 27], using model-based fault detection techniques, it is shown that if the network of consensus nodes is $2k + 1$ connected then up to $k$ Byzantine nodes can be identified. However, model-based proposals for detecting multiple attackers seem computationally costly; they likely have only limited applicability.

Our work focuses on detecting Byzantine attackers in the consensus-based NIDS of [1]. One of our two detection techniques, outlier detection, derives from outlier methods used to monitor applications of consensus algorithms to cooperative spectrum sensing [24]. The second detection technique, fault detection, derives from a model-based fault detection technique in process control and control theory [27]. We also introduce an approach to remove compromised NIDS modules such that the intrusion detection system can be returned to a non-compromised state.

The removal of compromised modules conflicts with some mathematical assumptions about average consensus algorithms. Indeed, proofs that neighbor-to-neighbor data exchanges converge to a consensus are valid under the assumption that the network is static. The removal of a compromised module changes the network topology of the NIDS; thus the system is no longer guaranteed to work correctly even under normal circumstances (no attacks). Here, the relevant background research comes from dynamic consensus theory, concerned with applications of consensus algorithms to dynamic network topologies, facing issues such as communication time-delays, failing physical links or network nodes, and mobile wireless networks [29, 30, 31]. While our objective is only to logically remove (isolate) a compromised NIDS module, the network conditions this creates are quite similar to those in [32, 33, 34]; therefore we have drawn our solution more specifically from these works.

All the solutions proposed in this paper are based on local knowledge. Decisions to categorize modules as compromised, and to further remove a compromised module, depend on information gathered from neighbors only. The computation to detect and remove compromised modules is therefore fully distributed, thus keeping the consensus-based NIDS fully distributed. Last, we have designed our defense strategies to protect the NIDS against a single compromised module. Detecting and removing multiple compromised modules from potentially coordinated attackers is left to future work.

The major contributions of this paper include presenting the impact of malicious peers on the detection capability of our consensus-based Network Intrusion Detection System (NIDS) scheme. We analyze the vulnerabilities of consensus-based NIDSs by proposing a Byzantine attack model which aims to adjust and stealthily manipulate results. Our defense strategies detect and remove compromised NIDS modules without impacting the logical functionality of the system. We compare these strategies under various detection parameters and network topologies through extensive simulations and analysis using a real NIDS and the NSL-KDD data set [35]. Our results demonstrate that the proposed methods can indeed unveil peers with malicious intent and disruptions in the information exchange of a peer-to-peer NIDS.

In the remainder, we briefly describe the consensus-based NIDS in [1].
We point out variations of falsification attacks and outline our two detection techniques to adjust the trustworthiness of participating peers. Thereafter, we illustrate the salient features of our prediction model to identify Byzantine peers and describe a practical experiment we conducted to showcase its functionality.

2 Consensus based NIDS

This section describes the average consensus algorithm. Next, a summary of the consensus-based NIDS in [1] is provided. Lastly, we describe our approach to isolate compromised modules, together with the mathematical background that supports this approach.

2.1 Average consensus

The average consensus algorithm computes the average $\frac{1}{n}\sum_{i=1}^{n} x_i$ of some initial values $x_1, x_2, \ldots, x_n$. It is a distributed algorithm where each process can be viewed as running independently on a particular node of an undirected graph. Let $G = (V, E)$ be such a graph, where $V = \{v_1, v_2, \ldots, v_n\}$ denotes the set of nodes and $E$ denotes the corresponding set of edges. Graphs have an adjacency structure represented by an $n \times n$ adjacency matrix (denoted by $A$ here) where $a_{ij} = 1$ if and only if $(v_i, v_j) \in E$, and $a_{ij} = 0$ otherwise. The adjacency structure of $G$ defines for each node $v_i \in G$ a neighborhood $N_i$, where $N_i = \{v_j \in V \mid (v_i, v_j) \in E\}$. Each node $v_i$ of $G$ computes the following recurrence equation:

$$x_i(t+1) = W_{ii}\,x_i(t) + \sum_{j \in N_i} W_{ij}\,x_j(t), \qquad (1)$$

where recurrence $i$ is initialized with $x_i(0) = x_i$, the initial value of each node $i$ (from now on we denote node $v_i$ simply by $i$). The purpose of a consensus algorithm is to reach "consensus", i.e. $x_i(t)$ converges asymptotically to $\frac{1}{n}\sum_{i=1}^{n} x_i$ for all nodes $i \in G$. As evident from Equation (1), each node $i$ obtains $x_i(t+1)$ using only its previous value $x_i(t)$ and the previous values $x_j(t)$ of the nodes that are in the neighborhood of $i$ ($x_j$, $j \in N_i$). Nonetheless, all nodes converge to $\frac{1}{n}\sum_{i=1}^{n} x_i$ because the diffusion of the local averages through neighborhoods that share common nodes accounts for all nodes computing the global average.

Whether nodes reach consensus, and which particular consensus value is reached, is determined by the dynamics of the linear dynamical system that equation (1) specifies, which in turn depends on the transition matrix $W$. Each entry $W_{ij}$ of matrix $W$ represents a weight on edge $(i, j) \in G$. These individual weights have to be chosen carefully to ensure convergence, and convergence to a specific value. For example, in equation (1), consensus on $\frac{1}{n}\sum_{i=1}^{n} x_i$ can be obtained by computing local averages of $x_i(t)$ and $x_j(t)$ for $j \in N_i$ using $W_{ij} = \frac{1}{|N_i|+1}$ for $(i, j) \in G$ (including the self-edge $(i, i)$).

A system as in (1) can reach consensus if the weight matrix satisfies certain conditions, as stated in [36]. Two conditions concern our application of average consensus to network intrusion detection: 1) the undirected graph $G$ needs to be connected, i.e. there is a path between each pair of nodes; 2) the weight matrix $W$ must be row stochastic, i.e. $\sum_{j=1}^{n} W_{ij} = 1$, the weights of each row sum to 1 (note that for an undirected graph $W_{ij} = W_{ji}$, therefore $W = W^T$; consequently the weight matrix $W$ is doubly stochastic, $\sum_{j=1}^{n} W_{ij} = \sum_{i=1}^{n} W_{ji} = 1$). Several weight matrices satisfy these conditions; the following matrices have been used for the consensus-based NIDS:

– Metropolis-Hastings matrix:

$$W_{ij} = \begin{cases} \frac{1}{1+\max(d_i, d_j)} & \text{if } i \neq j \text{ and } j \in N_i \\ 1 - \sum_{k \in N_i} W_{ik} & \text{if } i = j \\ 0 & \text{if } i \neq j \text{ and } j \notin N_i \end{cases}$$

where $d_i = |N_i|$.
– Best-constant edge weight matrix:

$$W_{ij} = \frac{2}{\lambda_1(L) + \lambda_{n-1}(L)}$$

where $L$ is the Laplacian matrix of the NIDS network, and $\lambda_1, \lambda_{n-1}$ are the first and $(n-1)$-th eigenvalues of $L$.

– Local-degree weight matrix, where the weight of an edge is based on the largest degree of its two adjacent vertices:

$$W_{ij} = \frac{1}{\max\{d_i, d_j\}}.$$

– Max-degree weight matrix, where $d_{\max}$ is the largest degree of the vertices in the network:

$$W_{ij} = \frac{1}{d_{\max}}.$$

Note that for the last three matrices, $W_{ii} = 1 - \sum_{k \in N_i} W_{ik}$ and $W_{ij} = 0$ if $j \notin N_i$. Note also that these weight matrices guarantee asymptotic convergence: $x(t)$ converges to $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$ as $t \to \infty$; we refer to this average consensus algorithm as the asymptotic average consensus. Weight matrices have an impact on the speed of convergence (the number of iterations needed to get close enough to the average sum) [37, 38]. Given an average consensus application, it is worth comparing the convergence speed of different weight matrices to identify the one with the best performance. It is also worth noticing that the graph topology impacts the convergence speed of average consensus algorithms [39].

2.2 Consensus-based NIDS

Figure 1: Network Intrusion Detection System.

As pictured in Figure 1, a consensus-based NIDS is a set of modules, each placed strategically on nodes of the monitored computer network so as to observe traffic in the corresponding sub-network. Each module consists of traffic sensors that receive copies of all transported packets within the observed network, and calculates an initial local probability of observing benign or malignant network traffic. The NIDS modules observing local network traffic are themselves connected by a physical network. Without loss of generality, we assume that the physical links connecting pairs of NIDS modules are direct (wired or wireless) physical links. The NIDS network is modeled by a graph where each node of the graph represents an NIDS module. It is assumed that this graph is connected. For the purpose of analysis and comparisons, we study specific topologies of NIDS networks; we refer to such a specific network as an NIDS network topology.

2.2.1 Network traffic analysis

The detection method of each NIDS module is "anomaly based", using the well-known naive Bayes classifier. The analysis focuses on detecting Distributed Denial of Service (DDoS) attacks, such as Land attack, Syn flood and UDP storm. The naive Bayes classifier assesses the statistically normal behavior - the 'likelihood' of a set of values to occur - with the help of labeled historic data. Our set of $m$ features includes most of the variables offered by the NSL-KDD data set, such as the number of bytes, service, and number of connections. The probability of intrusion is computed for each of these features. $P(o_j \mid h)$ expresses the likelihood of the occurrence $o_j$ given the historic anomalous ($h_a$) or normal ($h_n$) occurrences. Thus, if events receive the same values as benign or malignant network traffic during training, they result in a high probability for those. Assuming conditional independence of the $m$ features, the joint likelihood $P(O_i \mid h)$ of NIDS module $i$ is the product of all feature likelihoods:

$$P(O_i \mid h) = \prod_{j=1}^{m} P(o_j \mid h). \qquad (2)$$

Each NIDS module locally assigns the joint likelihood, indicating the abnormality of each event.
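To make the per-module analysis concrete, the following is a minimal Python sketch of the joint log-likelihood computation of equation (2); the feature names and likelihood values are hypothetical placeholders, not the implementation evaluated in this paper.

```python
import math

# Hypothetical per-feature likelihood tables estimated from labeled historic
# data: P(o_j | h) as pairs (anomalous h_a, normal h_n) for each feature value.
likelihood = {
    "service":   {"http": (0.02, 0.70), "private": (0.55, 0.05)},
    "src_bytes": {"low": (0.10, 0.60), "high": (0.80, 0.10)},
}

def joint_log_likelihood(observation):
    """Return (log P(O_i | h_a), log P(O_i | h_n)), i.e. equation (2) in
    log space, assuming conditional independence of the features."""
    log_pa, log_pn = 0.0, 0.0
    for feature, value in observation.items():
        pa, pn = likelihood[feature][value]
        log_pa += math.log(pa)   # sum of logs equals log of the product
        log_pn += math.log(pn)
    return log_pa, log_pn

# Example: one observed event at a module
print(joint_log_likelihood({"service": "private", "src_bytes": "high"}))
```

Working in log space, as the paper does for the consensus phase, avoids underflow when the product of many small feature likelihoods is taken.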
2.2.2 Consensus phase

Following the sensing and data analysis by the Bayesian network, each NIDS module enters a phase where it computes the average of the $n$ log-likelihoods, $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$, while communicating only with direct neighbors. This phase is labeled the NIDS consensus phase. Let $x_i(0) = \log(P(O_i \mid h))$ be the initial state of module $i$, where $x_i(0)$ is the likelihood for module $i$ to see a certain set of network features. As explained in Section 2.1, the average is computed iteratively and independently by each module $i$ as a weighted sum of $x_i$ and the $x_j$ for $j \in N_i$, as defined in equation (1). We identify as the consensus loop the iterations of equation (1), and $x_i(t+1)$ as the consensus value of module $i$ at iteration $t+1$. The consensus phase is the computation performed by the $n$ consensus loops to reach consensus. This phase is defined in mathematical terms by the following dynamical system:

$$x(t+1) = W x(t), \qquad t = 0, 1, \ldots \qquad (3)$$

where $x(t)$ is a vector of $n$ entries denoting the $n$ consensus values at iteration $t$ of the consensus phase, and $W$ is the weight matrix.

The stopping condition of consensus loop $i$ (also known as the 'convergence parameter' of recurrence $i$) is given by $|x_i(t+1) - x_i(t)| < \epsilon$, i.e. when the change in the consensus value from iteration $t$ to iteration $t+1$ is smaller than a pre-defined threshold value $\epsilon$. For weight matrices satisfying the convergence assumptions, the value $|x_i(t+1) - x_i(t)|$ decreases asymptotically as $t \to \infty$; once this value is smaller than $\epsilon$, the corresponding consensus loop is said to have converged. A consensus phase is completed once each consensus loop has converged. The number of iterations of a consensus phase is given by the consensus loop that needed the largest number of iterations to satisfy the stopping condition. The convergence speed of a consensus phase is the number of iterations needed for the consensus phase to complete. The value of $\epsilon$ is set so as to minimize the number of iterations of the consensus phase while ensuring accuracy of the decision about the state of the network traffic. The consensus phase is synchronous: all nodes must have completed the consensus loop at iteration $t$ before proceeding to execute the consensus loop at iteration $t+1$. Finally, as a matter of implementation, once an NIDS module has converged, it stops updating its consensus value but continues to send the last updated value to its neighbors.
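As an illustration, below is a minimal Python sketch of a synchronous consensus phase using Metropolis-Hastings weights and the stopping condition above; the 4-node ring, the initial log-likelihoods and the value of epsilon are illustrative assumptions, not the code used in the experiments.

```python
# Minimal synchronous average-consensus sketch (Metropolis-Hastings weights).
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # 4-node ring
x = {0: -20.0, 1: -35.0, 2: -50.0, 3: -41.0}              # x_i(0), illustrative
EPS = 1e-6                                                # illustrative epsilon

def mh_weight(i, j):
    # W_ij = 1 / (1 + max(d_i, d_j)) for a neighbor j of i
    return 1.0 / (1.0 + max(len(neighbors[i]), len(neighbors[j])))

def step(x):
    """One synchronous application of equation (1) at every node."""
    new_x = {}
    for i in x:
        w_ii = 1.0 - sum(mh_weight(i, j) for j in neighbors[i])
        new_x[i] = w_ii * x[i] + sum(mh_weight(i, j) * x[j] for j in neighbors[i])
    return new_x

t = 0
while True:
    new_x = step(x)
    # consensus phase completes when every loop satisfies |x_i(t+1)-x_i(t)| < eps
    if all(abs(new_x[i] - x[i]) < EPS for i in x):
        break
    x, t = new_x, t + 1

print(t, x)  # all entries close to the average of the initial values (-36.5)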
2.3 Removing compromised modules

As discussed in the introduction, once an NIDS module $j$ has been identified by a neighbor $i$ as compromised, module $j$ must be logically disconnected from $i$ to maintain the integrity of the intrusion detection system. It is relatively simple to disconnect an NIDS module locally because the weight matrix $W$ is known to each NIDS module $i$ (or at least the weights associated with row $i$ are known). Once a node $i$ has identified a neighbor $j$ as compromised, node $i$ can simply apply the following change to the weight matrix: $W_{ij} = 0$. Unfortunately, $W_{ii} + \sum_{j \in N_i} W_{ij}$ then no longer sums to 1, and $W$ fails to satisfy one of the two consensus convergence conditions. In order to fully address this issue, we have revisited the convergence proofs of average consensus, more specifically the convergence proofs for dynamic consensus (consensus under dynamic network topologies).

The consensus algorithm described in Section 2.1 is a static consensus algorithm because the weight matrix stays unchanged during the consensus phase. The weight matrix (which is actually a weighted adjacency matrix) mirrors the physical network topology underlying the NIDS. Static consensus cannot be used for applications where the underlying network topology is dynamic, i.e. where links or nodes fail, or where nodes enter and leave the network dynamically, as in wireless ad-hoc networks. Dynamic consensus theory formally addresses consensus convergence issues arising in dynamical networks. Dynamic consensus is relevant to our work because the impact on the NIDS of logically removing a compromised module is (model wise) the same as that of a failing node. The convergence theory of dynamic consensus is the mathematical support for our solution strategy for the removal of compromised nodes in a consensus-based NIDS. There are several avenues in control theory to address dynamic networks; the work in [32, 33, 34] is directly related to our problem.

As stated in Section 2.1, the two convergence conditions the consensus phase of the NIDS must satisfy are network connectivity and a stochastic weight matrix. In [32], it is shown that the connectivity condition is surprisingly mild for dynamic network topologies: the collection of dynamically changing topologies during the consensus phase only needs to be jointly connected to guarantee convergence. In our work this condition is always satisfied. Only one NIDS module is removed during a consensus phase, therefore the collection of network topologies is limited to two. For the NIDS network topologies tested in the experimentation section of this paper, each topology is connected. The convergence condition related to the stochastic weight matrix, however, is violated; this is fixed as follows. Once a node $i$ has identified a neighbor $j$ as compromised, node $i$ sets $W_{ij} = 0$, thus locally removing the link $(i, j)$. The situation where $W_{ii} + \sum_{j \in N_i} W_{ij} < 1$ resulting from setting $W_{ij} = 0$ is eliminated by increasing the weight of the self-edge by the same amount $W_{ij}$: $W_{ii} = W_{ii} + W_{ij}$. This solution is only implementable if the information for updating the weights of row $i$ in $W$ can be computed locally. This is the case for the Metropolis-Hastings weight matrix, as the weight of each edge depends on the degree of the adjacent nodes. It will not work for weight matrices like the best-constant edge weight matrix or the max-degree weight matrix, where the weights depend on global information (such as the max-degree node in the network). Tests in this paper where nodes are logically disconnected in an NIDS network topology are based on the Metropolis-Hastings weight matrix.
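A minimal sketch of this local repair, under the Metropolis-Hastings weights of Section 2.1; the dict-based data structures are illustrative assumptions.

```python
# Locally disconnect a compromised neighbor j from module i while keeping
# row i of the weight matrix stochastic (illustrative sketch).
def isolate_neighbor(W, neighbors, i, j):
    """Set W[i][j] = 0 and fold the removed weight into the self-edge,
    so that W[i][i] + sum of W[i][k] over remaining neighbors k equals 1."""
    removed = W[i][j]
    W[i][j] = 0.0
    W[i][i] += removed          # W_ii <- W_ii + W_ij
    neighbors[i].remove(j)      # module j is no longer read in the loop

# Example with row 0 of the 4-node ring from the earlier sketch:
W = {0: {0: 1/3, 1: 1/3, 3: 1/3}}
neighbors = {0: [1, 3]}
isolate_neighbor(W, neighbors, 0, 3)
print(W[0])   # {0: 2/3, 1: 1/3, 3: 0.0}; the row still sums to 1
```

Because the Metropolis-Hastings weights of row $i$ depend only on the degrees of $i$ and its neighbors, this repair needs no global information, which is exactly why it is the weight matrix used in the disconnection tests.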
3 Detection of Byzantine attacks

Byzantine attacks aim at degrading the accuracy of the network intrusion detection system. Accuracy is defined as follows:

$$\frac{TP + TN}{TP + TN + FP + FN},$$

where TP (True Positive) is the number of attacks detected when there is actually an attack; TN (True Negative) is the number of normals detected when the traffic is actually normal; FP (False Positive) is the number of attacks detected when the traffic is actually normal; and FN (False Negative) is the number of normals detected when there is actually an attack. Byzantine attacks on NIDS modules can aim at masking malicious traffic by decreasing the probability of attacks initially computed by the naive Bayesian classifier. Attackers may also increase the probability of attacks computed by the naive Bayesian classifier, thus increasing the number of false positives; the reliability of the system is then questioned by the system administrators. This section first provides an attack model on the consensus phase; this model is used by the tests conducted in the next section. Second, two techniques are described which aim at identifying compromised NIDS modules.

Figure 2: Convergence speeds with and without loop disruption.

3.1 Byzantine attack model

Byzantine attacks on the consensus phase of consensus-based intrusion detection algorithms can take the following forms [27, 40]:

1. Data falsification attacks: sensor values are falsified, thus the consensus loop is initialized with values originating from falsified network traffic readings;

2. Consensus loop disruptions:

(a) the attacker ignores the consensus value computed at each iteration and keeps transmitting the same constant $c$;

(b) the attacker sends its neighbors a falsified consensus value [27].

Figure 2 illustrates the impact of a type 2(a) attack on the convergence speed of the consensus phase. It plots the distribution of the convergence speed of 1000 consensus phases, each having only honest NIDS modules (No attack), versus a scenario where each consensus phase has one compromised module sending the same constant value $c$ to its neighbors (Attack). Figure 2 shows that convergence is much slower in a compromised system: each consensus phase needs between 250 and 300 iterations to converge, while in a system without a compromised module, consensus phases need between 40 and 125 iterations to converge. Moreover, NIDS modules in a compromised system fail to converge to the average consensus $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$; rather, they all converge to $c$ [41].

In this paper we seek to discover consensus loop disruption attacks of type 2(b). Equation (4) below models this type of attack inside the consensus loop of a compromised NIDS module:

$$x_j(t+1) = W_{jj}\,x_j(t) + \sum_{i \in N_j} W_{ji}\,x_i(t) + u_j(t). \qquad (4)$$

This recurrence equation is similar to equation (1) except for the variable $u_j(t)$, which models the value selected by the attacker for modifying the true consensus value of the compromised node $j$. The falsified consensus value $x_j(t+1)$ is sent to all the neighbors of node $j$ at iteration $t+1$. Other Byzantine attack models, including multiple colluding attackers, are described in [24, 42].
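Building on the earlier consensus sketch, a type 2(b) disruption can be simulated by adding an injection term to the update of the compromised module, as in equation (4); the function and constants below are illustrative assumptions.

```python
# Sketch of a type 2(b) consensus loop disruption (equation (4)):
# the compromised module adds an external input u_j(t) to its honest
# update before sending the falsified value to its neighbors.
def byzantine_step(x, neighbors, weight, compromised, u_j=0.5):
    new_x = {}
    for i in x:
        w_ii = 1.0 - sum(weight(i, j) for j in neighbors[i])
        new_x[i] = w_ii * x[i] + sum(weight(i, j) * x[j] for j in neighbors[i])
        if i == compromised:
            new_x[i] += u_j      # u_j(t) is nonzero only at the Byzantine node
    return new_x
```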
3.2 Detection techniques

We describe two detection techniques that handle consensus loop disruptions of type 2(b) by a single Byzantine attacker. The first detection technique is an outlier detection procedure. This procedure is executed by each module $i$ and evaluates, at each consensus loop iteration, the potential that a neighbor of module $i$ is compromised. The second detection technique is an adaptation to cyber-attacks of a model-based fault-detection technique from process engineering and control theory. Like the first one, it is a procedure executed by each module, observing its neighbors and seeking to identify a compromised one.

3.2.1 Outlier detection

Outlier detection techniques have been applied to detect Byzantine attacks in wireless sensor networks [43]. These techniques use distance thresholds between the value $x_j(t)$ sent by a neighbor $j$ to node $i$ and some reference value $r_i$. For example, if $r_i(t) = x_i(t)$, neighbor $j$ is flagged as compromised if $|x_j(t) - x_i(t)| > \lambda$ for some threshold value $\lambda$. However, this idea had to be refined. For example, a unique predefined threshold for all nodes may easily be discovered by intruders. Furthermore, as the nodes of a consensus algorithm converge to the same value, the absolute differences $|x_j(t) - x_i(t)|$ between two nodes $i$ and $j$ converge to zero as $t \to \infty$, rendering the outlier detection potentially insensitive once the absolute differences get smaller than $\lambda$.

Adaptive thresholds have been proposed to address the above issues [23, 24]. They consist, for each node $i$, of computing a local threshold $\lambda_i$ and adapting the threshold at each consensus iteration to the reduction of the absolute differences $|x_j(t) - x_i(t)|$. In [23], the threshold

$$\lambda_i(t+1) = \frac{\sum_{j \in N_i} |x_j(t+1) - x_i(t+1)|}{\sum_{j \in N_i} |x_j(t) - x_i(t)|}\;\lambda_i(t) \qquad (5)$$

(for properly initialized $\lambda_i(0)$) is computed by each node $i$ at each iteration of the consensus phase. The rule in equation (5) computes $\lambda_i$ using the diffusion dynamics of consensus algorithms, so unless the attacker can get multi-hop information access, it cannot foresee the values of its neighbors' thresholds. Consequently, the attacker cannot adapt its consensus loop disruption attack to keep the injected values under the radar of the detection procedure. As the network converges towards consensus, the value $\lambda$ converges toward zero, leading the attacker to eventually be filtered out.

Note that $\lambda_i(t)$ partitions the neighbors of node $i$ into two sets: those neighbors $j$ that have a deviation $|x_j(t) - x_i(t)| \geq \lambda_i(t)$ are considered suspicious; they constitute the neighborhood $N_i^F$ of states that have less weight in the computation of the consensus value $x_i(t+1)$:

$$x_i(t+1) = x_i(t) + \epsilon \sum_{j \in N_i^T} x_j(t) + \frac{\epsilon}{a} \sum_{j \in N_i^F} x_j(t)$$

for some constant $a$. Our outlier detection method computes the threshold $\lambda$ as in equation (5). Those neighbors $j$ that have a deviation $|x_j(t) - x_i(t)| \geq \lambda_i(t)$ are flagged as suspicious. We use a majority rule similar to [24] to convert the status of a neighbor NIDS module $j$ from suspicious to attacker. Let $B$ be the number of common neighbors of module $i$ and module $j$. If more than $\lceil B/2 \rceil$ neighbors of module $i$ report $j$ as suspicious, then module $j$ is considered compromised and is disconnected/removed from the intrusion detection system. Note that we assume a single attacker; if the majority rule identifies more than one neighbor as compromised, the one with the largest deviation is disconnected from the NIDS network.
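The following sketch illustrates the adaptive threshold of equation (5), the suspicion flagging and the majority rule; the data structures mirror the earlier sketches, and the initial threshold value is an assumption since the paper leaves the initialization of lambda_i(0) open.

```python
import math

# Adaptive-threshold outlier flagging (equation (5)), illustrative sketch.
def update_threshold(lam_i, x_new, x_old, i, neighbors_i):
    """lambda_i(t+1) = lambda_i(t) * (summed deviations at t+1) / (summed at t)."""
    num = sum(abs(x_new[j] - x_new[i]) for j in neighbors_i)
    den = sum(abs(x_old[j] - x_old[i]) for j in neighbors_i)
    return lam_i * num / den if den > 0 else lam_i

def suspicious_neighbors(x, lam_i, i, neighbors_i):
    """Neighbors whose deviation from x_i meets or exceeds the local threshold."""
    return [j for j in neighbors_i if abs(x[j] - x[i]) >= lam_i]

def is_compromised(reports_on_j, common_neighbors_B):
    """Majority rule: j is compromised if more than ceil(B/2) common
    neighbors of i and j have reported j as suspicious."""
    return reports_on_j > math.ceil(common_neighbors_B / 2)

lam0 = 5.0   # illustrative lambda_i(0)
```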
3.2.2 Model-based fault detection

Fault detection is a field of control engineering concerned with identifying and locating faulty components in a system. The techniques in this field essentially compare measurements of the actual behavior of a system with its anticipated behavior. In model-based fault detection, the anticipated behavior is described using mathematical models [44]; the measured system variables are compared with their model estimates. Comparisons between the system and the model show deviations when there is a fault in the real system. Such a difference between the system and its model is called a residual or residual vector. There exist several implementations of the model-based approach; the observer-based technique [45] 1) seeks to discriminate between deviations caused by faults in the real process and those caused by the estimations, and 2) provides a residual vector that indicates the faulty system components (the so-called directional residual). Observer-based approaches to cyber-security have been proposed recently in different contexts [46, 47, 48, 49, 50]. We focus more specifically on applications of observer-based fault detection to identify Byzantine attackers in consensus-based algorithms [26, 27].

In order to detect Byzantine attackers during the consensus phase of the NIDS, the design of the consensus loop of each NIDS module is modified to include new matrix-based computations that estimate the consensus state vector $x(t)$; we call this new function of the consensus loop the observer. At each iteration of the consensus loop, the observer computes a state vector $x^o(t)$ estimating $x(t)$, where $x(t)$ is the vector storing the consensus values at iteration $t$ of the consensus loop. We first model the consensus loop disruption attack of equation (4) in matrix form:

$$x(t+1) = W x(t) + I_n u(t) \qquad (6)$$

where $I_n$ is the $n$-dimensional identity matrix, and where $u_i(t) = 0$ whenever NIDS module $i$ behaves normally. The observer requires inputs from the state vector $x(t)$, i.e. the values $x_j(t) \in x(t)$ where $j \in N_i$. These values are stored in a vector $y_i$. The consensus loop of each NIDS module $i$ is now defined as follows:

$$x(t+1) = W x(t) + I_n u(t), \qquad y_i(t) = C_i\,x(t) \qquad (7)$$

where $C_i$ is a $(\deg_i + 1) \times n$ matrix in which entry $C_i[k, l] = 1$ if $l \in N_i$, otherwise $C_i[k, l] = 0$. The vector $y_i(t)$ has $\deg_i + 1$ entries; each entry $j$ of $y_i(t)$ stores the state $x_j(t)$ at time $t$ of a module $j \in N_i$.

Equation (7) represents the consensus loop of a given module $i$ as if it could access all the consensus values at iteration $t$, though in fact module $i$ can only access $x_i(t)$ and $x_j(t)$ for $j \in N_i$. The other entries of vector $x(t)$ are not needed during the computation performed by the recurrence relation of node $i$, so it is not incorrect to model these entries as if they were available. Note that each NIDS module $i$ knows the consensus matrix $W$, the matrix $C_i$, the values $x_j(t) \in x(t)$ for $j \in N_i$, and the identity matrix $I_n$. However, the set of non-zero $u_i$ is unknown to the non-malicious modules. To detect a malicious neighbor of module $i$, the consensus loop of each module computes the following matrix operations [27]:

$$z(t+1) = (W + G C_i)\,z(t) - G\,y_i(t), \qquad x^o(t) = L\,z(t) + K\,y_i(t) \qquad (8)$$

where $z(t)$ is the state of the observer and $x^o(t)$ is the estimation by the observer of module $i$ of the consensus state $x(t)$. The matrices used to compute $z(t+1)$ and $x^o(t)$ are defined as follows: $G = -W_{N_i}$, $K = C_i^T$, $L = I_n - K C_i$, where $W_{N_i}$ denotes the columns of $W$ with indexes in $N_i$. The system in (8) has roots in the observability theory of control theory; a detailed analysis of this system is beyond the scope of this paper, and we refer to [45] for a historical development and analysis of observer-based fault detection systems. The analysis of (8) can be simplified because the consensus system (7) satisfies some conditions [26]. It can be shown that as $t \to \infty$, $x^o(t) \to x(t)$; consequently the estimation error $e(t) = x^o(t) - x(t)$ converges to 0. We are also given that equation (8) under the consensus system in (7) simplifies to:

$$x_j^o(t) = \begin{cases} x_j(t) & \text{if } j = i \text{ or } j \in N_i \\ z_j(t) & \text{otherwise} \end{cases} \qquad (9)$$

and that the state of the observer $z(t+1)$ can be expressed in terms of the consensus matrix [27]:

$$z(t+1) = W x^o(t). \qquad (10)$$

The iteration error $\varepsilon(t) = |x^o(t+1) - W x^o(t)|$ can then be used as a residual vector.
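A compact numerical sketch of the simplified observer of equations (9) and (10), together with the residual; numpy, the function name and the calling convention are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

# Simplified observer iteration for module i (equations (9) and (10)).
# W: the n x n consensus weight matrix, known to every module.
# observed: the indices {i} union N_i whose true states module i receives.
def observer_residual(W, observed, x_est_prev, x_recv_next):
    """One observer step: predict with z(t+1) = W x^o(t) (equation (10)),
    then overwrite the observable entries with the values actually received
    (equation (9)). Returns (residual eps(t), updated estimate x^o(t+1))."""
    z_next = W @ x_est_prev              # prediction from the model
    x_est_next = z_next.copy()
    for j in observed:                   # entries module i can read directly
        x_est_next[j] = x_recv_next[j]
    eps = np.abs(x_est_next - z_next)    # nonzero only where a received value
    return eps, x_est_next               # deviates from the model prediction
```

A neighbor $j$ whose residual entry stays persistently nonzero as $t$ grows is flagged, since the residual tends to the injected input once the estimation error has dissipated, as explained next.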
From (9) and (10), $\varepsilon_j(t) = 0$ for $j \neq i$ and $j \notin N_i$. If $\varepsilon_j(t) \neq 0$, either $x_j^o(t) \neq x_j(t)$ (the estimation error is greater than 0) or $u_j \neq 0$. Since the estimation error dissipates as $t \to \infty$, we have $(x^o(t+1) - W x^o(t)) \to I_n u(t)$ as $t \to \infty$. If $u_j \neq 0$ for some $j \in \{1, \ldots, n\}$ then $(x_j^o(t+1) - (W x^o(t))_j) \to u_j(t)$, and the corresponding module $j$ is detected as compromised.

Together with the consensus loop in (7), the observer defined in (8) provides an algorithm with which each NIDS module can detect whether one of its neighbors sends falsified consensus data. Each module $i$ builds a consensus system and an observer as described in equations (7) and (8). At each consensus iteration, each module $i$ computes $\varepsilon(t) = |x^o(t+1) - W x^o(t)|$. If $\varepsilon_j(t) \neq 0$ then module $j \in N_i$ is compromised. Module $i$ then logically removes module $j$ from its neighborhood by modifying its weight matrix according to the description in Section 2.3, thus stopping the injection of an external input by module $j$ into the network intrusion detection system.

4 Empirical analysis

The above two Byzantine attack detection techniques help the NIDS cope with adversarial environments by detecting compromised modules. In this section we analyze and compare the behavior of each technique. For example, these techniques have a computational cost: we measure the overhead of running each technique. We measure how fast attacks are detected, and the accuracy of the decisions made by the NIDS under each detection method. Last, as the removal of a compromised module is obtained by changing the weight matrix and the network topology, we measure whether these changes have any impact on the convergence speed of the consensus phases, i.e. whether the system returns to its full functioning capabilities after removing compromised modules.

To execute this empirical analysis, the two Byzantine attack identification techniques described in Section 3 have been coded as part of the consensus phase of the NIDS simulations described in [1]. We have run simulations for the following NIDS network topologies: rings with 9 and 25 nodes (NIDS modules), 2-dimensional tori with 9 and 25 nodes, the Petersen graph (10 nodes, 15 edges), and several random graphs having the same number of vertices and edges as the Petersen graph. A simulation consists of executing 1000 iterations for one of the above NIDS network topologies. In one iteration, each NIDS module of the network topology reads the local network traffic from an entry of the NSL-KDD data set, performs a Bayesian analysis of the local traffic, then executes its consensus loop until convergence. Note that we have filtered the attacks in the NSL-KDD data set to retain only denial of service attacks.

The consensus phase is implemented as follows. The Bayesian analysis of the local network traffic by module $i$ returns two values: $p_{A_i}$, the probability that the observed traffic at module $i$ is intrusive, and $p_{N_i}$, the probability that the observed traffic at module $i$ is normal. These values are used to initialize the consensus loop of the corresponding module $i$: $x_i^A(0) = \log(p_{A_i})$ and $x_i^N(0) = \log(p_{N_i})$, for $i = 1..n$. During the consensus phase, for simulations involving the outlier detection technique, each NIDS module $i$ computes the following recurrence relations until $|x_i^A(t+1) - x_i^A(t)| < \epsilon$ and $|x_i^N(t+1) - x_i^N(t)| < \epsilon$:

$$x_i^A(t+1) = W_{ii}\,x_i^A(t) + \sum_{j \in N_i} W_{ij}\,x_j^A(t) + u_i(t) \qquad (11)$$

$$x_i^N(t+1) = W_{ii}\,x_i^N(t) + \sum_{j \in N_i} W_{ij}\,x_j^N(t) + u_i(t). \qquad (12)$$
Similarly, in simulations involving the fault detection technique, each NIDS module $i$ computes the solutions of the following iterative systems until each recurrence of the systems has converged:

$$x^A(t+1) = W x^A(t) + I u^A, \qquad y_i^A(t) = C_i\,x^A(t) \qquad (13)$$

$$x^N(t+1) = W x^N(t) + I u^N, \qquad y_i^N(t) = C_i\,x^N(t). \qquad (14)$$

The matrix operations described in equation (8) are also computed at each iteration of the consensus loop of module $i$ in simulations involving the fault detection technique.

Once a consensus phase is completed, each NIDS module $i$ decides whether or not to raise an alert based on its consensus approximation $\frac{x_i^A(t)}{x_i^N(t)}$ of the actual ratio $\frac{\frac{1}{n}\sum_{i=1}^{n} \log(p_{A_i})}{\frac{1}{n}\sum_{i=1}^{n} \log(p_{N_i})}$ and some predefined alert value ratio. As each module converges asymptotically to the same actual ratio, all modules reach the same decision, which constitutes a form of coordinated response to perceived anomalies in the network traffic.

The consensus loop disruption of the attack model 2(b) in Section 3.1 is implemented as follows. Anomaly-based intrusion detection systems tend to have high false positive rates. We simulate attacks that aim to further increase the number of false positives. Attacks inject positive values into the consensus loop component (11) or (13). At each iteration of a simulation, a to-be-compromised NIDS module $j$ is selected randomly, and $u_j^A$ is then assigned a positive value. The magnitude of $u_j^A$ has to be large enough to falsify the decision at the end of the consensus phase (i.e. raise an alert when traffic is normal) if the consensus loop disruption attack is not detected. For example, $u_j^A = 0.0005$ is too small: it does not have an impact on the decision. However, a value such as $u_j^A = 0.5$ can cause each module of the system to converge to an approximation $\frac{x_i^A(t)}{x_i^N(t)} > \frac{\frac{1}{n}\sum_{i=1}^{n} \log(p_{A_i})}{\frac{1}{n}\sum_{i=1}^{n} \log(p_{N_i})}$, thus possibly leading the NIDS to raise an alert when in fact there is no attack. The value $u_j^A = 0.5$ is also suitable for obtaining meaningful test results for the following reason. The values $p_{A_i}$ and $p_{N_i}$ returned by the Bayesian analysis of a module $i$ are products of likelihoods $\prod_{j=1}^{m} P(o_j \mid h)$; as the number of features is large, these products are very small. During the consensus phase, neighbor NIDS modules exchange log-likelihoods, which are in the range between -20 and -55. So $u_j^A = 0.5$ is a relatively small external input during the consensus phase. It is, however, large enough that our two detection techniques can always detect this attack, while failing to detect it soon enough can lead the consensus phase to converge to values quite different from $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$.

In the following sections, we first evaluate the computational cost of running each of the two detection techniques. We also report the number of consensus iterations needed to detect a compromised module. Next we analyze the efficiency of the detection techniques at preventing the occurrence of false positives at the conclusion of a consensus phase. Finally we analyze the impact of our technique for removing a compromised NIDS module on the convergence speed of consensus phases.
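To make the decision rule concrete, here is a minimal sketch of the per-module alert decision taken after the consensus phase; the alert ratio value and the comparison direction follow the description above (an injected positive $u_j^A$ pushes the approximation above the actual ratio and produces false alerts), but the concrete threshold is an illustrative assumption.

```python
# Per-module alert decision after the consensus phase (illustrative).
ALERT_RATIO = 0.6   # hypothetical predefined alert value ratio

def raise_alert(x_a, x_n, alert_ratio=ALERT_RATIO):
    """x_a, x_n: converged consensus values approximating the averaged
    log-likelihoods of anomalous and normal traffic. An approximation
    ratio exceeding the predefined alert value ratio triggers the alert;
    since all modules converge to the same ratio, the response is
    coordinated across the NIDS."""
    return (x_a / x_n) > alert_ratio
```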
4.1 Computational costs

Table 1 reports the computational cost of running each detection technique. All the simulations are executed while no attack takes place; these tests measure only the overhead of running the code implementing the two detection techniques. The column "Cost" reports the time in milliseconds for running the NIDS network simulation for 1000 iterations. In Table 1, the "no detection" rows give the cost of running an NIDS simulation without the execution of any detection code. The "outlier" and "fault" rows give the cost of running the NIDS modules while also executing, respectively, the code for the outlier method and the fault method. The higher cost of the detection techniques compared to "no detection" for the same network size and topology reflects the cost of protecting the consensus-based NIDS with the corresponding detection technique. Table 1 shows that the computational overhead of outlier detection is clearly smaller than that of the fault detection method. These results were expected: each consensus loop iteration of the fault detection method runs several matrix operations, compared to simple scalar operations for the outlier method.

Table 1: Consensus-based NIDS computational simulation costs in milliseconds.

Topology   Size  Detection     Cost
Ring       9     no detection  0.050
                 outlier       0.276
                 fault         0.921
           25    no detection  0.101
                 outlier       1.131
                 fault         3.286
Torus      9     no detection  0.027
                 outlier       0.121
                 fault         1.327
           25    no detection  0.043
                 outlier       0.567
                 fault         6.055
Petersen   10    no detection  0.005
                 outlier       0.135
                 fault         0.597
Random     10    no detection  0.013
                 outlier       0.290
                 fault         1.268

4.2 Detection speed

Figures 3 to 8 detail the rapidity (detection speed) with which the two detection techniques identify compromised modules. Each figure corresponds to a different network topology. The values on the x axis are the number of consensus iterations needed before the compromised module is identified. The y axis displays the percentage of the 1000 consensus phases that needed a given number of consensus iterations to detect a compromised module. These figures clearly show that the fault detection approach needs fewer iterations to detect Byzantine attacks. Combining this with the computational cost in Table 1, we observe that the outlier method has a more favorable computational overhead but requires more iterations to detect compromised modules.

Figure 3: Detection speed of ring topology 9 nodes.
Figure 4: Detection speed of ring topology 25 nodes.
Figure 5: Detection speed of torus topology 9 nodes.
Figure 6: Detection speed of torus topology 25 nodes.
Figure 7: Detection speed of Petersen graph.
Figure 8: Detection speed of random graphs.

4.3 Intrusion detection accuracy

Disruption of the consensus loops by injecting external inputs has an impact on the accuracy of the decision made by the NIDS about the state of the network traffic. Table 2 measures how effective the two detection techniques are at maintaining the accuracy of the consensus-based NIDS. The "no attack" rows report the accuracy of the NIDS in a non-adversarial environment. The "no detection" rows report the accuracy of the NIDS when attacks take place while the NIDS is not protected. The "outlier" and "fault" rows report, respectively, the accuracy of the NIDS protected by the outlier and fault detection methods.

The results of Table 2 are obtained without changing the weight matrix and network topology once a compromised NIDS module is identified (static consensus). Let $l$ be the iteration of the consensus phase at which module $i$ identifies a neighbor module $j$ as compromised.
For $t > l$, module $i$ applies the following update rule:

$$x_i^A(t+1) = W_{ii}\,x_i^A(t) + \sum_{k \in N_i,\, k \neq j} W_{ik}\,x_k^A(t) + W_{ij}\,x_j^A(t) - 0.5.$$

This update is possible since $u_j^A = 0.5$ is known in the context of our simulations, though it is not known which module is compromised in this way. Module $i$ removes 0.5 from the value sent by the compromised module $j$; therefore module $i$ updates its state with the true consensus values of its neighbors.

Table 2 shows the outlier detection method to be less accurate than the fault detection method, even though Byzantine attacks are always detected and the compromised NIDS module neutralized. These results are explained by the number of consensus loop iterations needed to detect attackers. Figures 3 to 8 show the outlier method needing more iterations to detect compromised modules. The more iterations it takes to detect a compromised module, the more data injections take place prior to its detection and the more time the injected values have to diffuse across the NIDS modules, which causes the NIDS decision to tilt the wrong way more frequently in the case of the outlier method.

Table 2: Accuracy of the NIDS.

Topology   Size  Detection     TP   TN   FP   FN
Ring       9     no attack     466  520  14   0
                 no detection  521  0    479  0
                 outlier       522  404  74   0
                 fault         456  525  4    15
           25    no attack     527  473  0    0
                 no detection  475  0    525  0
                 outlier       497  503  58   0
                 fault         506  489  0    5
Torus      9     no attack     493  491  16   0
                 no detection  499  0    501  0
                 outlier       495  438  67   0
                 fault         478  511  0    11
           25    no attack     492  507  1    0
                 no detection  497  0    503  0
                 outlier       491  456  53   0
                 fault         518  450  32   0
Petersen   10    no attack     501  487  12   0
                 no detection  481  0    519  0
                 outlier       477  458  65   0
                 fault         481  516  0    3
Random     10    no attack     451  533  16   0
                 no detection  485  0    515  0
                 outlier       503  432  65   0
                 fault         526  464  0    10

4.4 Convergence speed

This section analyzes the impact of removing a module while the NIDS computes consensus states. According to Section 2.3, the technique we propose to isolate a compromised module satisfies the average consensus convergence conditions: rows of the weight matrices sum to 1, and each NIDS network topology tested in this empirical analysis section is such that it is still connected even after one module is removed. However, as our approach changes the weight matrix and the NIDS network topology, two factors that could impact the convergence speed, we still need to analyze the consensus phase convergence speed when modules are removed. In this section we compare the consensus phase convergence speed of the static consensus implementation of Section 4.3, running the Metropolis-Hastings weight matrix, with the convergence speed when the consensus phase is implemented with the dynamic consensus procedure introduced in Section 2.3.

Figures 9 to 13 compare the convergence speed of static versus dynamic consensus for the outlier detection method, while figures 14 to 18 compare the convergence speed of static versus dynamic consensus for the fault detection method. As we can see from figures 14 to 18, there are no significant differences in the convergence speed of static and dynamic consensus for the fault-based detection method, except for the Petersen graph. With the outlier detection method, as shown in figures 9 to 13, dynamic consensus converges faster for some of the network topologies. It is not entirely clear why the convergence speed is better with dynamic consensus in some specific outlier simulations.
Nonetheless, figures 9 to 18 show no significant decrease in the convergence speed of consensus phases once a module has been isolated. As accuracy is not impacted by the removal of a module, this is enough to conclude that the intrusion detection system returns to a fully functioning state.

Figure 9: Convergence: dynamic topology, outlier detection, ring topology 9 nodes.
Figure 10: Convergence: dynamic topology, outlier detection, ring topology 25 nodes.
Figure 11: Convergence: dynamic topology, outlier detection, torus topology 9 nodes.
Figure 12: Convergence: dynamic topology, outlier detection, torus topology 25 nodes.
Figure 13: Convergence: dynamic topology, outlier detection, Petersen graph.
Figure 14: Convergence: dynamic topology, fault detection, ring topology 9 nodes.
Figure 15: Convergence: dynamic topology, fault detection, ring topology 25 nodes.
Figure 16: Convergence: dynamic topology, fault detection, torus topology 9 nodes.
Figure 17: Convergence: dynamic topology, fault detection, torus topology 25 nodes.
Figure 18: Convergence: dynamic topology, fault detection, Petersen graph.

5 Conclusion

Local data exchanges of consensus-based distributed applications can be hacked by Byzantine attackers falsifying computed consensus information. Several solutions have been proposed in the literature to address Byzantine attacks on consensus algorithms. We have adapted two of these solutions, one from model-based fault detection and one from outlier detection, to protect a consensus-based network intrusion detection system. We have also applied results from dynamic consensus theory to derive a simple approach to isolate compromised modules from the network while continuing to satisfy the mathematical assumptions required for convergence of the consensus phase.

Our results show that the two methods we propose can be used to detect consensus loop disruptions and prevent falsifications of NIDS network traffic assessments. Though preliminary, our results also show significant computational costs for these approaches, either in terms of the number of iterations to detect attacks (outlier detection) or in terms of the computational cost of each iteration (model-based detection). This might raise issues for deploying consensus-based NIDSs in suitable environments such as wireless ad hoc networks.

Future work will address both protecting the consensus-based NIDS against disruptive attacks and getting the system closer to deployment in wireless network environments. We will work on reducing the computational cost, for example by speeding up the consensus phase, i.e. reducing the number of iterations needed for the modules to come to agreed decisions. This will impact Byzantine fault detection, which will need to be done at earlier stages and at a smaller computational cost. Byzantine fault detection will be broadened to other attack models, involving more than one compromised module and possibly colluding attackers. Addressing multiple attackers seems achievable without too much research effort using outlier or reputation-based methods. On the other hand, current model-based approaches in control theory seem too computationally demanding and will need more research before they can be used in a deployed system. Finally, we intend to broaden the cooperation among NIDS modules.
This depends on consensus computing more functions of the initial values provided by the analysis phase. There is a wide range of functions that can be computed using distributed iterative methods similar to average consensus. This will bring more versatility in detecting network intrusions and allow for a wide range of coordinated responses to address detected malicious network activities.

References

[1] M. Toulouse, B. Q. Minh, and P. Curtis, "A consensus based network intrusion detection system," in IT Convergence and Security (ICITCS), 2015 5th International Conference on. IEEE, 2015, pp. 1–6. [Online]. Available: http://dblp.uni-trier.de/db/conf/icitcs/icitcs2015.html#ToulouseMC15

[2] A. Patel, M. Taghavi, K. Bakhtiyari, and J. Celestino Júnior, "Review: An intrusion detection and prevention system in cloud computing: A systematic review," J. Netw. Comput. Appl., vol. 36, no. 1, pp. 25–41, Jan. 2013. [Online]. Available: http://dx.doi.org/10.1016/j.jnca.2012.08.007

[3] C. V. Zhou, C. Leckie, and S. Karunasekera, "A survey of coordinated attacks and collaborative intrusion detection," Computers & Security, vol. 29, no. 1, pp. 124–140, 2010. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S016740480900073X

[4] E. Vasilomanolakis, S. Karuppayah, M. Mühlhäuser, and M. Fischer, "Taxonomy and survey of collaborative intrusion detection," ACM Comput. Surv., vol. 47, no. 4, pp. 55:1–55:33, May 2015. [Online]. Available: http://doi.acm.org/10.1145/2716260

[5] S. R. Snapp, J. Brentano, G. V. Dias, T. L. Goan, L. T. Heberlein, C. L. Ho, K. N. Levitt, B. Mukherjee, S. E. Smaha, T. Grance et al., "DIDS (distributed intrusion detection system) - motivation, architecture, and an early prototype," in Proceedings of the 14th National Computer Security Conference, vol. 1. Citeseer, 1991, pp. 167–176.

[6] T. Bass, "Multisensor data fusion for next generation distributed intrusion detection systems," in Proceedings of the IRIS National Symposium on Sensor and Data Fusion, 1999, pp. 24–27.

[7] R. Janakiraman, M. Waldvogel, and Q. Zhang, "Indra: A peer-to-peer approach to network intrusion detection and prevention," in Enabling Technologies: Infrastructure for Collaborative Enterprises, 2003. WET ICE 2003. Proceedings. Twelfth IEEE International Workshops on. IEEE, 2003, pp. 226–231.

[8] C. V. Zhou, S. Karunasekera, and C. Leckie, "A peer-to-peer collaborative intrusion detection system," in 2005 13th IEEE International Conference on Networks, jointly held with the 2005 IEEE 7th Malaysia International Conference on Communications, vol. 1, Nov 2005, pp. 118–123.

[9] M. Locasto, J. J. Parekh, A. D. Keromytis, and S. J. Stolfo, "Towards collaborative security and p2p intrusion detection," in Proceedings of the IEEE Information Assurance Workshop (IAW), 2005, pp. 333–339.

[10] M. Marchetti, M. Messori, and M. Colajanni, Peer-to-Peer Architecture for Collaborative Intrusion and Malware Detection on a Large Scale. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 475–490. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-04474-8_37

[11] N. A. Lynch, Distributed Algorithms. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1996.

[12] T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen, and O. Shochet, "Novel type of phase transition in a system of self-driven particles," Phys. Rev. Lett., vol. 75, pp. 1226–1229, Aug 1995. [Online]. Available: http://link.aps.org/doi/10.1103/PhysRevLett.75.1226
[13] R. Olfati-Saber and R. M. Murray, “Consensus protocols for networks of dynamic agents,” in American Control Conference, 2003. Proceedings of the 2003, vol. 2, June 2003, pp. 951–956.

[14] A. Fagiolini, M. Pellinacci, M. Valenti, G. Dini, and A. Bicchi, “Consensus-based distributed intrusion detection for multi-robot systems,” in Proc. IEEE Int. Conf. on Robotics and Automation, 2008, pp. 120–127.

[15] J. Tsitsiklis, D. Bertsekas, and M. Athans, “Distributed asynchronous deterministic and stochastic gradient optimization algorithms,” Automatic Control, IEEE Transactions on, vol. 31, no. 9, pp. 803–812, Sep. 1986.

[16] S. Li, G. Oikonomou, T. Tryfonas, T. Chen, and L. Xu, “A distributed consensus algorithm for decision-making in service-oriented internet of things,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1461–1468, 2014. [Online]. Available: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6740862

[17] A. Narayanan, J. Bonneau, E. Felten, A. Miller, and S. Goldfeder, Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction. Princeton, NJ, USA: Princeton University Press, 2016.

[18] I. F. Akyildiz, B. F. Lo, and R. Balakrishnan, “Cooperative spectrum sensing in cognitive radio networks: A survey,” Phys. Commun., vol. 4, no. 1, pp. 40–62, Mar. 2011. [Online]. Available: http://dx.doi.org/10.1016/j.phycom.2010.12.003

[19] G. Xiong and S. Kishore, “Consensus-based distributed detection algorithm in wireless ad hoc networks,” in Signal Processing and Communication Systems, 2009. ICSPCS 2009. 3rd International Conference on, Sept 2009, pp. 1–6.

[20] K. Avrachenkov, M. E. Chamie, and G. Neglia, “A local average consensus algorithm for wireless sensor networks,” in 2011 International Conference on Distributed Computing in Sensor Systems and Workshops (DCOSS), June 2011, pp. 1–6.

[21] M. Pease, R. Shostak, and L. Lamport, “Reaching agreement in the presence of faults,” J. ACM, vol. 27, no. 2, pp. 228–234, Apr. 1980. [Online]. Available: http://doi.acm.org/10.1145/322186.322188

[22] L. Lamport, R. Shostak, and M. Pease, “The Byzantine generals problem,” ACM Trans. Program. Lang. Syst., vol. 4, no. 3, pp. 382–401, Jul. 1982. [Online]. Available: http://doi.acm.org/10.1145/357172.357176

[23] S. Liu, H. Zhu, S. Li, X. Li, C. Chen, and X. Guan, “An adaptive deviation-tolerant secure scheme for distributed cooperative spectrum sensing,” in 2012 IEEE Global Communications Conference, GLOBECOM 2012, Anaheim, CA, USA, December 3-7, 2012, 2012, pp. 603–608. [Online]. Available: http://dx.doi.org/10.1109/GLOCOM.2012.6503179

[24] Q. Yan, M. Li, T. Jiang, W. Lou, and Y. T. Hou, “Vulnerability and protection for distributed consensus-based spectrum sensing in cognitive radio networks,” in INFOCOM, 2012 Proceedings IEEE. IEEE, 2012, pp. 900–908.

[25] W. Zeng and M.-Y. Chow, “A reputation-based secure distributed control methodology in D-NCS,” IEEE Trans. Industrial Electronics, vol. 61, no. 11, pp. 6294–6303, 2014. [Online]. Available: http://dblp.uni-trier.de/db/journals/tie/tie61.html#ZengC14

[26] F. Pasqualetti, A. Bicchi, and F. Bullo, “Consensus computation in unreliable networks: A system theoretic approach,” IEEE Transactions on Automatic Control, vol. 57, no. 1, pp. 90–104, Jan. 2012.

[27] F. Pasqualetti, A. Bicchi, and F. Bullo, “Distributed intrusion detection for secure consensus computations,” in Decision and Control, 2007 46th IEEE Conference on, Dec 2007, pp. 5594–5599.

[28] S. Sundaram and C. N. Hadjicostis,
“Distributed function calculation via linear iterative strategies in the presence of malicious agents,” IEEE Transactions on Automatic Control, vol. 56, no. 7, pp. 1495–1508, July 2011.

[29] L. Xiao, S. Boyd, and S.-J. Kim, “Distributed average consensus with least-mean-square deviation,” Journal of Parallel and Distributed Computing, vol. 67, no. 1, pp. 33–46, 2007. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0743731506001808

[30] M. Zhu and S. Martínez, “Discrete-time dynamic average consensus,” Automatica, vol. 46, no. 2, pp. 322–329, 2010. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0005109809004828

[31] R. Olfati-Saber and R. M. Murray, “Consensus problems in networks of agents with switching topology and time-delays,” Automatic Control, IEEE Transactions on, vol. 49, no. 9, pp. 1520–1533, Sep. 2004. [Online]. Available: http://dx.doi.org/10.1109/tac.2004.834113

[32] L. Xiao, S. Boyd, and S. Lall, “Distributed average consensus with time-varying metropolis weights,” 2006, unpublished. [Online]. Available: http://web.stanford.edu/~boyd/papers/pdf/avg_metropolis.pdf

[33] L. Xiao, S. Boyd, and S. Lall, “A space-time diffusion scheme for peer-to-peer least-squares estimation,” in Proceedings of the Fifth International Conference on Information Processing in Sensor Networks, IPSN 2006, Nashville, Tennessee, USA, April 19-21, 2006, 2006, pp. 168–176. [Online]. Available: http://doi.acm.org/10.1145/1127777.1127806

[34] L. Xiao, S. Boyd, and S. Lall, “A scheme for robust distributed sensor fusion based on average consensus,” in Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, ser. IPSN '05. Piscataway, NJ, USA: IEEE Press, 2005. [Online]. Available: http://dl.acm.org/citation.cfm?id=1147685.1147698

[35] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in Proceedings of the Second IEEE International Conference on Computational Intelligence for Security and Defense Applications, ser. CISDA'09. Piscataway, NJ, USA: IEEE Press, 2009, pp. 53–58. [Online]. Available: http://dl.acm.org/citation.cfm?id=1736481.1736489

[36] L. Xiao, S. Boyd, and S.-J. Kim, “Distributed average consensus with least-mean-square deviation,” J. Parallel Distrib. Comput., vol. 67, no. 1, pp. 33–46, Jan. 2007. [Online]. Available: http://dx.doi.org/10.1016/j.jpdc.2006.08.010

[37] A. Olshevsky and J. N. Tsitsiklis, “Convergence speed in distributed consensus and averaging,” SIAM J. Control Optim., vol. 48, no. 1, pp. 33–55, Feb. 2009. [Online]. Available: http://dx.doi.org/10.1137/060678324

[38] L. Xiao and S. Boyd, “Fast linear iterations for distributed averaging,” Systems and Control Letters, vol. 53, pp. 65–78, 2003.

[39] S. Kar and J. M. F. Moura, “Topology for global average consensus,” October 2006, pp. 276–280.

[40] B. Kailkhura, S. Brahma, and P. K. Varshney, “Data falsification attacks on consensus-based detection systems,” IEEE Transactions on Signal and Information Processing over Networks, vol. 3, no. 1, pp. 145–158, March 2017.

[41] W. Ben-Ameur, P. Bianchi, and J. Jakubowicz, “Robust average consensus using total variation gossip algorithm,” in 6th International ICST Conference on Performance Evaluation Methodologies and Tools, Cargese, Corsica, France, October 9-12, 2012, 2012, pp. 99–106. [Online].
Available: http://dx.doi.org/10.4108/valuetools.2012.250316

[42] S. Mi, H. Han, C. Chen, J. Yan, and X. Guan, “A secure scheme for distributed consensus estimation against data falsification in heterogeneous wireless sensor networks,” Sensors, vol. 16, no. 2, p. 252, 2016. [Online]. Available: http://www.mdpi.com/1424-8220/16/2/252

[43] V. P. Illiano and E. C. Lupu, “Detecting malicious data injections in wireless sensor networks: A survey,” ACM Comput. Surv., vol. 48, no. 2, pp. 24:1–24:33, Oct. 2015. [Online]. Available: http://doi.acm.org/10.1145/2818184

[44] R. Isermann, “Model-based fault-detection and diagnosis - status and applications,” Annual Reviews in Control, vol. 29, pp. 71–85, 2005.

[45] J. Chen, J. R. Patton, and H.-Y. Zhang, “Design of unknown input observers and robust fault detection filters,” International Journal of Control, vol. 63, no. 1, pp. 85–105, 1996.

[46] Z. A. Biron, P. Pisu, and B. HomChaudhuri, “Observer design based cyber security for cyber physical systems,” in Proceedings of the 10th Annual Cyber and Information Security Research Conference, ser. CISR '15. New York, NY, USA: ACM, 2015, pp. 6:1–6:6. [Online]. Available: http://doi.acm.org/10.1145/2746266.2746272

[47] D. Ding, Z. Wang, D. W. C. Ho, and G. Wei, “Observer-based event-triggering consensus control for multiagent systems with lossy sensors and cyber-attacks,” IEEE Transactions on Cybernetics, vol. PP, no. 99, pp. 1–12, 2016.

[48] F. Pasqualetti, R. Carli, A. Bicchi, and F. Bullo, “Identifying cyber attacks via local model information,” in International Conference on Decision and Control - CDC 2010, Atlanta, USA, December 2010, pp. 5961–5966.

[49] A. Teixeira, H. Sandberg, and K. H. Johansson, “Networked control systems under cyber attacks with applications to power networks,” in Proceedings of the 2010 American Control Conference, June 2010, pp. 3690–3696.

[50] L. Negash, S. Kim, and H. Choi, “Distributed unknown-input-observers for cyber attack detection and isolation in formation flying UAVs,” CoRR, vol. abs/1701.06325, 2017. [Online]. Available: http://arxiv.org/abs/1701.06325

Individual Classification: an Ontological Fuzzy Based Approach

Asma Djellal
Preparatory School of Economics, Business and Management Sciences of Constantine, Algeria
LIRE Laboratory, Constantine 2 - Abdelhamid Mehri University, Constantine, Algeria
E-mail: asmadjellal@gmail.com

Zizette Boufaida
LIRE Laboratory, Constantine 2 - Abdelhamid Mehri University, Constantine, Algeria
E-mail: zizette.boufaida@univ-constantine2.dz

Keywords: fuzzy logic, fuzzy ontology, classification reasoning, individual classification, fuzzy ontologies realization

Received: July 25, 2016

Recently, several reasoners for very expressive fuzzy Description Logics have been implemented. However, in some cases, applications do not require all the reasoner services and would benefit from the efficiency of just certain reasoning tasks. To this end, we are interested in the individual fuzzy classification issue. In fact, decision-making applications for real-world domains are often based on classifying new situations into fuzzy categories. Therefore, we propose Fuzzy Realizer to offer an effective classification even with imprecise/vague or incomplete knowledge, so that appropriate decisions can be made. Fuzzy Realizer is a Java prototype implementation for realizing fuzzy ontologies. It supports the well-known fuzzy description logic Z SHOIN (D).
It allows (i) fuzzy concrete domains, (ii) modified concepts and (iii) weighted concepts. It is able to (i) classify new individuals, even with incomplete descriptions, (ii) provide a more human-oriented classification by hiding the crisp boundaries between different fuzzy categories and (iii) populate fuzzy ontologies, which addresses an aspect of fuzzy ontology evolution, a topic that is rarely discussed.

Povzetek: Razvit je postopek za individualno klasifikacijo s pomočjo mehke logike.

1 Introduction

Crisp ontologies, based on first-order logic formalisms, are not suitable for handling imperfect knowledge. Knowledge imperfection, manifested by incomplete, vague or imprecise notions, is inherent to several real-world domains, and this problem has therefore attracted the attention of many research communities [21, 22, 26, 28, 29]. Several approaches have incorporated fuzzy logic into ontology languages and description logics (DLs) to build so-called fuzzy ontologies. Indeed, a number of reasoners for very expressive fuzzy DLs have been implemented [31], including FiRE [25], FuzzyDL [3, 6] and DeLorean [2]. Moreover, a number of optimization techniques have been proposed recently for improving reasoning efficiency for very expressive fuzzy DLs [5, 24]. However, in some cases, applications do not require all the reasoner services and would benefit from the efficiency of just certain reasoning tasks. To this end, we have been interested in the fuzzy ontology realization issue.

Realizing fuzzy ontologies with new individuals is a very important reasoning task. Using this reasoning task, several real-world domains can benefit from effective decision-making applications. Indeed, in a domain like e-health, doctors always classify their patients into fuzzy categories. When referring to a patient's fever, for example, if we have a body temperature of 38.5°, it will be stated that the patient has a "high" fever. However, a temperature of 38° will also present a "high" fever, but it can equally be stated that it is an "average" fever. A similar classification can be used in industry, where Industrial Process Control Systems collect data, such as the temperature and pressure of gas and oil pipes, to be classified as safe situations or not. Based on this classification, appropriate decisions can be made.

Classification is the main reasoning mechanism for systems based on class/instance models. It is one of the most powerful and fundamental human inference mechanisms. It maintains the stability of the knowledge base in the presence of new knowledge by connecting each piece of knowledge to its class. However, since we are handling imperfect knowledge, giving exact definitions of class boundaries seems to be a very difficult, perhaps even impossible, task. Therefore, we have integrated fuzzy logic with classification to enable the attachment of an individual to several fuzzy classes. Such attachment makes the sharp borders between classes disappear, which better reflects reality and allows a more human-oriented modelling process.

Having these ideas in mind, we propose a fuzzy-based approach for realizing fuzzy ontologies by classifying new individuals and connecting them to their most specialized concepts. Based on this classification, operators may make appropriate decisions. With our approach, two features of knowledge imperfection can be handled: vagueness/imprecision and incompleteness.
Indeed, based on a fuzzy classification algorithm, the proposed reasoning service can classify new individuals, even with incomplete descriptions. To validate our ideas, we have implemented this algorithm in what we call Fuzzy Realizer. It is a Java prototype implementation supporting the fuzzy DL SHOIN (D) under Zadeh semantics (Z SHOIN (D)). It allows (i) fuzzy concrete domains, (ii) modified concepts and (iii) weighted concepts. The key features of Fuzzy Realizer are that (i) it can classify new individuals even though we may lack information about them, (ii) it provides a more human-oriented classification process by assigning an individual to several fuzzy concepts with different membership degrees and, finally, (iii) it can populate fuzzy ontologies, which addresses an aspect of fuzzy ontology evolution, a topic that is rarely discussed. Indeed, ever since the development of ontologies, especially from large text corpora, became a well-understood problem [23], reconstruction has always been preferred to an evolutionary process. In fact, the evolution problem is challenging [33] and needs to be analysed from different points of view; thus, the present paper addresses the individual classification issue by providing a realization service for fuzzy ontologies.

The remainder of this paper is organized as follows. Section 2 presents some preliminaries that will be used in the rest of the paper, namely fuzzy logic and the classification reasoning mechanism. Section 3 reviews some related works and situates our work in that context. Section 4 discusses the proposed fuzzy realization algorithm; then an extension of this approach, namely a fuzzy relocation process, is presented in Section 5. To validate our ideas, we present Fuzzy Realizer in Section 6. Finally, Section 7 concludes the paper with ideas for future research.

2 Preliminaries

This section describes some background material regarding (i) fuzzy logic and its use for representing imperfect knowledge, and (ii) the classification reasoning mechanism.

2.1 Fuzzy logic and fuzzy ontology

Fuzzy logic was designed to solve the problem of vague/fuzzy and imprecise knowledge representation. It was introduced by L. A. Zadeh in the mid-1960s as an extension of Boolean logic [34]. In classical set theory, there are two possibilities: elements either belong to a set or they do not. This theory does not consider many situations that are frequently encountered in everyday life, where imprecision is manifested by terms like high, young, hot and the like. Fuzzy logic, based on fuzzy set theory, is designed to consider this kind of situation. It is based on the notion of partial membership, where each element belongs partially or gradually to defined fuzzy subsets.

Definition. Let X be a set of elements. A fuzzy subset A of X is defined by a function called the membership function, denoted μA. It is a mapping which takes any value from the real interval [0, 1]:

μA : X → [0, 1], x ↦ μA(x)

The crisp set operators negation, intersection and union are extended to fuzzy subsets and performed by fuzzy negation, t-norm and s-norm functions, respectively, so that one can form different fuzzy logics. The most widely used one is Zadeh fuzzy logic, known as Zadeh Semantics [4]. It is the combination of the Gödel conjunction tG(a, b) = min(a, b), the Gödel disjunction sG(a, b) = max(a, b) and the Łukasiewicz negation NL(a) = 1 − a.
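As a concrete rendering of these connectives, the following minimal Java sketch (our own illustration; the class and method names are not taken from any system cited here) evaluates the Zadeh operators on two membership degrees.

// Zadeh semantics: Goedel conjunction/disjunction and Lukasiewicz negation.
public class ZadehSemantics {
    static double and(double a, double b) { return Math.min(a, b); } // t-norm tG
    static double or(double a, double b)  { return Math.max(a, b); } // s-norm sG
    static double not(double a)           { return 1.0 - a; }        // negation NL

    public static void main(String[] args) {
        double high = 0.7, average = 0.4;       // example membership degrees
        System.out.println(and(high, average)); // 0.4
        System.out.println(or(high, average));  // 0.7
        System.out.println(not(high));          // ~0.3
    }
}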
Fuzzy calculus is a vast and very flexible research field; indeed, it is used in many domains, one of them being fuzzy ontology development [5, 1, 14]. Fuzzy ontologies extend crisp ones by interpreting concepts and roles as fuzzy sets of individuals and fuzzy binary relations, respectively. Unlike crisp ontologies, which allow an element to be described or not, {0, 1}, by each concept in the ontology, fuzzy ontologies associate an element with each concept using a membership degree in the interval [0, 1]. Such association allows the attachment of each element to different concepts with different membership degrees. Consequently, fuzzy ontologies have a more flexible representation capability than crisp ones. In fact, vague notions, manifested by fuzzy terms like high_temperature, very_close_to and the like, are quite common in human language, and they can be represented by means of fuzzy ontology elements using different constructs [29]; the most important of these are:

Explicit fuzzy concepts. Represented by means of fuzzy membership functions using fuzzy concrete domains, such as High_temperature, which is a fuzzy concept defined with the fuzzy concrete domain High with its right-shoulder membership function High(37, 38.5) as:

High_temperature ≡ temperature ⊓ ∃Degree.High

Modified concepts. Fuzzy modifiers, such as very or slightly, are defined by functions fm : [0, 1] → [0, 1] applied to change membership functions. For instance, Very_high_temperature is a fuzzy modified concept defined with Very as a fuzzy modifier having the function fVery(x) = x² as:

Very_high_temperature ≡ temperature ⊓ ∃Degree.Very(High)

Weighted concepts. Sometimes we want to express the importance of concepts, representing preferences or priorities, such as 0.8 (C). These concepts, called fuzzy weighted concepts, are defined as follows:

D ≡ w (C), w ∈ [0, 1]

For the rest of the paper, m and fm are used to represent fuzzy modifiers and their membership functions, while w (w ∈ [0, 1]) is used to express weights of concepts. In this section, we have provided some preliminaries regarding fuzzy ontologies by introducing the basic concepts which are involved. For a more in-depth presentation, we refer the reader to [30].
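A small Java sketch (again our own illustration, using the High(37, 38.5) right-shoulder function and the modifier fVery(x) = x² defined above; the parametrization of the right-shoulder function as rising linearly between its two arguments is a standard convention we assume here) shows how the degrees of these three constructs would be computed for a body temperature of 38°.

// Right-shoulder membership function: 0 below a, 1 above b, linear in between.
public class FuzzyTemperature {
    static double rightShoulder(double x, double a, double b) {
        if (x <= a) return 0.0;
        if (x >= b) return 1.0;
        return (x - a) / (b - a);
    }
    static double very(double d)             { return d * d; } // modifier fVery(x) = x^2
    static double weight(double w, double d) { return w * d; } // weighted concept w(C)

    public static void main(String[] args) {
        double t = 38.0;
        double high = rightShoulder(t, 37.0, 38.5);                    // High(37, 38.5) -> ~0.67
        System.out.println("High_temperature:      " + high);
        System.out.println("Very_high_temperature: " + very(high));        // ~0.44
        System.out.println("0.8(High_temperature): " + weight(0.8, high)); // ~0.53
    }
}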
2.2 Classification reasoning mechanism

Classification is the fundamental inference mechanism for object-based representations. Indeed, structuring knowledge into classes, subclasses and instances promotes the use of classification to retrieve implicit knowledge. To this end, classification can be used to (i) categorize a set of objects into category graphs, (ii) add a new category to an already created graph or (iii) add a new object to its most specialized categories in the created graph [18]. This process, also called individual classification, refers to ontology realization. It is used to retain the stability of an already created knowledge base in the presence of a new individual by connecting it to the most specialized concepts it belongs to (see Figure 1). Classification of individuals consists of precisely selecting their belonging classes. Therefore, different classes have to be well separated. However, giving exact definitions of class boundaries is a very difficult, perhaps even impossible, task. The difficulty comes from the vagueness of the modelled knowledge.

To address this problem, we have integrated fuzzy logic with classification to enable the use of non-numerical values, which allow non-sharp definitions of class boundaries. Fuzzy classification [16, 32] is the process of grouping elements into fuzzy sets. The membership of these elements in each fuzzy set is not full but partial to some degree. The main difference between crisp and fuzzy classification is that in fuzzy classification, an element can belong to several fuzzy classes with different membership degrees. Such membership makes the sharp borders between classes disappear, which better reflects reality and allows a more human-oriented modelling process.

Figure 1: An individual classification example.

3 Related work

Work related to our research context explores two research fields: (i) handling imperfect knowledge and (ii) classification reasoning mechanisms.

3.1 Handling imperfect knowledge

It has been widely pointed out that crisp ontologies are not suitable for handling imperfect knowledge. Thus, many fuzzy approaches have been proposed to overcome this limitation [1, 3, 4, 7, 13, 27]. As a result, a few methodologies for developing fuzzy ontologies have been proposed and a number of fuzzy extensions of DLs have been used. However, like crisp ontologies, the success of fuzzy ones depends on the availability of effective software allowing their exploitation. Consequently, the reasoning task has been a very interesting topic for many researchers.

DeLorean (DEscription LOgic REasoner with vAgueNess) [2] was the first reasoner that supported a fuzzy extension of the DL SROIQ. As far as we know, DeLorean is the only reasoner that supports fuzzy OWL 2. Based on Zadeh Semantics, it represents fuzzy operators and reduces the resulting fuzzy Z SROIQ knowledge base to a crisp one by creating new crisp concepts and roles representing α-cuts [20] of the original fuzzy ones. Other quite similar studies have proposed reasoners for expressive fuzzy DLs. For instance, FiRE implements a tableau algorithm for fuzzy SHIN restricted to Zadeh Semantics [25]. YADLR is a Prolog implementation based on linear programming [17]. It supports a fuzzy extension of ALCOQ under Łukasiewicz and Zadeh fuzzy logics and allows variables as degrees of truth. In order to benefit from the full expressivity of a less expressive language and thus guarantee reasoning efficiency, LiFR was proposed [31]. It is a lightweight fuzzy reasoner oriented to mobile devices, and the supported language is f-DLP. It allows fuzzy concept assertions and weighted concepts. FuzzyDL [3, 6] is an important fuzzy reasoner supporting fuzzy extensions of SHIF (D) under Zadeh, Łukasiewicz and classical semantics. It has been successfully used in some practical applications. Its interesting features are aggregation of fuzzy concepts, explicit fuzzy set membership functions and fuzzy modifiers.

Like all these cited works, we are interested in reasoning with imperfect knowledge using fuzzy logic. However, unlike them, we have focused on just one reasoning task, in order to propose a fuzzy ontology realization service that is as complete and efficient as possible. As far as we know, no other work exploits the fuzzy classification mechanism with fuzzy ontologies, especially with incomplete individuals. On the other hand, there have been some previous attempts to combine this reasoning mechanism with fuzzy logic in other research fields, such as pattern recognition and data mining.
3.2 Classification reasoning mechanism

fCQL (Fuzzy Classification Query Language) is a toolkit for classification, analysis and decision support applied in the marketing domain of a telecom company [19, 32]. Meier et al. claimed that 'Using linguistic terms and variables hides the complexity of the domain and permits a more intuitive and human-oriented querying process in different application domains' [19, pp. 586-587]. Therefore, they exploited the advantages of fuzzy logic to reduce the complexity of business data and extract valuable hidden information through fuzzy classification. fCQL allows formulating fuzzy queries which are then transformed into SQL statements. This approach benefits from fuzzy logic in classification and querying. However, its main disadvantage is that it is a data-oriented approach; thus, semantic retrieval of resources is not supported.

A closer approach to ours is [12], which defines a semi-automated musical genre classification mechanism using an ontological representation. Fuzzy classification was used to allow the classification of music resources into musical genres based on a score provided by the resource composer expressing its viewpoint. Indeed, in music classification, different users are not required to agree about the classification of a specific music resource into the same musical genre. In this approach, fuzzy classification is flexible regarding the different interpretations of music genres. However, the consideration of vagueness is quite limited because (i) music resources are represented by crisp ontologies, (ii) fuzzy logic is restricted to expressing users' viewpoints and (iii) the membership degree is not calculated based on membership functions but is instead given by the user. Finally, in this approach, (iv) knowledge imperfection was considered without reference to knowledge incompleteness, which is an important feature of fuzzy knowledge.

4 A fuzzy realization algorithm

Using an illustrative example, we will study and improve upon a fuzzy realization algorithm proposed in a previous work [9]. The following algorithm has been extended and improved in order to accelerate the classification process.

Table 1: Fuzzy realization algorithm

Algorithm 1. Fuzzy realization algorithm
Input:  H: fuzzy concepts hierarchy (fuzzy ontology)
        A: new individual
Output: evolved fuzzy ontology
1. Initialization ( );
2. C* := TOP (H);
3. While (not empty (C*)) do
4.   Matching (C*, A);
5.   Marks-Propagation (C*, label, degree);
6.   C* := Next-Concept (C*);
7. End while

The proposed algorithm allows the realization of fuzzy ontologies and results in evolved ones in which the new individual A is attached to its most specialized concepts. First, the user provides the necessary knowledge (line 1) to start the classification loop. This loop consists of exploring the hierarchy and matching the current concept C* with A (line 4), starting at the hierarchy root TOP (line 2). The Matching procedure verifies A's membership in C* and, if A belongs to C*, the concept will be marked with a label and a membership degree. To accelerate the classification, the Marks-Propagation procedure (line 5) propagates marks to different concepts related to C* based on some logical rules. The next concept to be matched with A is chosen by the Next-Concept function (line 6). If there are no more unmarked concepts, Next-Concept returns null, which terminates the classification.
Illustrative example. In the following sections, we will study the proposed algorithm on the following illustrative example; it is an excerpt from a simple fuzzy knowledge base about persons:

TBox
[Ax 1] Person ⊑ ⊤
[Ax 2] Male ⊑ ⊤
[Ax 3] Female ⊑ ⊤
[Ax 4] Male ≡ ¬Female
[Ax 5] Man ≡ Person ⊓ Male
[Ax 6] Woman ≡ Person ⊓ Female
[Ax 7] Young ≡ Person ⊓ ∃HasAge.YoungAge
[Ax 8] Adult ≡ Person ⊓ ∃HasAge.AdultAge
[Ax 9] Teacher ≡ Adult ⊓ ∃HasFunction.Teacher
[Ax 10] VeryYoung ≡ Person ⊓ ∃HasAge.very(YoungAge)
[FCP 1] YoungAge (x) = Left-shoulder (10, 30)
[FCP 2] AdultAge (x) = Trapezoidal (30, 35, 50, 60)
[FCP 3] Very (x) = x²

ABox
[FCA 1] ⟨Tom: Person = 1⟩
[FCA 2] ⟨Tom: Male = 1⟩
[FCA 3] ⟨Lina: Person = 1⟩
[FCA 4] ⟨Lina: Female = 1⟩

Person, Male and Female are defined as atomic concepts. Axioms [Ax 5] and [Ax 6] define crisp concepts, while [Ax 7]–[Ax 10] describe some fuzzy ones. [FCP 1] and [FCP 2] concern the fuzzy concrete predicates YoungAge and AdultAge; they indicate the degree to which a person is young or adult, respectively, using left-shoulder and trapezoidal membership functions. [FCP 3] defines the fuzzy modifier Very. Finally, the ABox contains some fuzzy concept assertions defining two individuals: Tom and Lina.

4.1 Initialization ( ) procedure

To start the classification loop, we need to collect some information about the new individual in the form of (attribute, value) pairs. The user must provide as much knowledge as possible so that the algorithm can classify the individual as precisely as possible in the hierarchy. If the user does not have enough information, the Initialization procedure accepts the value 'Unknown'. Consequently, the proposed algorithm can classify incomplete individuals.

Definition. Let A be an individual defined by its description in the form of a set of (attribute, value) pairs. If we are missing information about an attribute of A, then it is incomplete. Formally:

A is incomplete ⇔ A = {(Att1, Val1), …, (Attn, Valn)} and ∃i such that (Atti, Unknown) ∈ A.

Example 1. Consider our ABox, having the individual Tom with its description: Tom = {(Name, Tom), (Age, 33), (Size, 1.7), (Function, Unknown), …}. Since we are missing information about the attribute "Function", Tom is incomplete.

4.2 Matching (C*, A) procedure

Matching is the algorithm's key procedure. It has the role of checking an individual's membership in the current concept based on a membership function. The classical two-valued membership function has been successfully applied to complete and precise knowledge, for which we can exactly define the belonging classes. However, it seems inappropriate for managing fuzzy knowledge bases, in which we handle imprecise and incomplete knowledge. To cover this limitation, we have chosen a membership function with three values. The scope of this function is extended to accept the value possible if we do not have sufficient information for affirming or denying an instance's membership in a given class. This function can be described as follows, given that x is an instance and C is a class with the membership function C(x):
C(x) = sure, if x ∈ C; impossible, if x ∉ C; possible, otherwise.

Based on this function, the Matching procedure marks the current concept C* with a label indicating whether it is sure, possible or impossible for the new individual (see Figure 2). Since there is no full membership in fuzzy ontologies, C* will be marked with another mark expressing the degree of this membership. In sum, the Matching procedure generates the following output: ⟨C*, label, degree⟩, where label ∈ {S, P, I} and degree ∈ [0, 1] if the new individual belongs to C* (that is, label = sure), or null if there is no membership. For the rest of the paper, S, P and I will be used to represent, respectively, the marks Sure, Possible and Impossible:

- ⟨C*, S, d⟩ (A is C* with a truth-value of d): if A's value for each attribute satisfies the constraints of C*. This membership can be determined only if A is complete.
- ⟨C*, I, null⟩ (A is not C*): if A's value for at least one attribute does not satisfy the constraints in C*. In this case, we do not consider whether A is incomplete.
- ⟨C*, P, null⟩ (A may be C*): if A is incomplete and its values do not stand in contradiction with C*.

Figure 2: Fuzzy classification of an individual based on concept marking.

Table 2: Matching (C*, A) procedure.

Procedure 1. Matching (C*, A)
Input:  C*: current concept
        A = {(Att1, Val1), …, (Attn, Valn)}: new individual
Output: ⟨C*, label, degree⟩
Degree: real;
1  Begin
2    If (∃ Vali = "") then
3      Request the user;
4    End if
5    Degree := Get_degree (C*, A);
6    If (Degree > 0) then
7      If (∃ Vali = "Unknown") then
8        Mark (C*, P, null);
9      Else
10       Mark (C*, S, Degree);
11     End if
12   Else
13     Mark (C*, I, null);
14   End if
15 End.

If C* includes some attributes that are not defined in the description of A, then Matching asks the user for values for these attributes (lines 2–4). Using the function Get_degree, the membership of A in C* is computed (line 5). If there is no membership (Degree = 0), the matching stops and C* will be marked Impossible (lines 12–14). If all constraints are satisfied (that is, Degree > 0), two cases are considered:

- A is incomplete: the matching stops and C* will be marked Possible (lines 6–8).
- A is complete: C* will be marked Sure to some degree 'Degree' (lines 9–11).

In order to mark C*, the function Get_degree calculates A's degree of membership in C*. C* can be described by several logical expressions: concept conjunction, modified concept, explicit fuzzy concept, etc. Based on the description of C* and under Zadeh Semantics, the Get_degree function proceeds according to the following cases:

- Concept conjunction C* ≡ C1 ⊓ … ⊓ Cn: Degree (C*, A) = min (Degree (Ci, A)), i = 1..n.
- Concept disjunction C* ≡ C1 ⊔ … ⊔ Cn: Degree (C*, A) = max (Degree (Ci, A)), i = 1..n.
- Concept negation C* ≡ ¬C: Degree (C*, A) = 1 − Degree (C, A).
- Fuzzy modified concept C* ≡ m(C): Degree (C*, A) = fm (Degree (C, A)).
- Fuzzy weighted concept C* ≡ w (C): Degree (C*, A) = w · Degree (C, A).
- Explicit fuzzy concept C* ≡ ∃Attribute.Range, where Range is a fuzzy predicate: Degree (C*, A) = fRange (A.Attribute); e.g. ∃Age.YoungAge results in fYoungAge (A.Age).
- Limited existential quantification C* ≡ ∃R.C: the function returns the maximum degree of membership in the concept C over all individuals Ai related by the role R: Degree (C*, A) = max (Degree (C, Ai)).
- Value restriction C* ≡ ∀R.C: the function returns the minimum degree of membership in the concept C over all individuals Ai related by the role R: Degree (C*, A) = min (Degree (C, Ai)).
- Min cardinality C* ≡ ≥ n R.C: if |{Ai : Degree (C, Ai) > 0}| ≥ n then Degree (C*, A) = 1, else 0.
- Max cardinality C* ≡ ≤ n R.C: if |{Ai : Degree (C, Ai) > 0}| ≤ n then Degree (C*, A) = 1, else 0.
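To make the recursion over these cases concrete, the following Java sketch (a minimal illustration of our own, not the Get_degree code of Fuzzy Realizer) models a few of the constructors and recomputes the degree of Example 2 below, together with the modified-concept degree used later in Example 5.

// Minimal illustrative model of Get_degree under Zadeh semantics.
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

interface Concept { double degree(Map<String, Double> a); } // a: attribute -> value

public class GetDegree {
    static Concept and(Concept c1, Concept c2) {            // conjunction -> min
        return a -> Math.min(c1.degree(a), c2.degree(a));
    }
    static Concept or(Concept c1, Concept c2) {             // disjunction -> max
        return a -> Math.max(c1.degree(a), c2.degree(a));
    }
    static Concept not(Concept c) { return a -> 1.0 - c.degree(a); }  // negation
    static Concept modified(DoubleUnaryOperator fm, Concept c) {      // m(C) -> fm(d)
        return a -> fm.applyAsDouble(c.degree(a));
    }
    static Concept weighted(double w, Concept c) { return a -> w * c.degree(a); } // w(C)
    // Explicit fuzzy concept: degree is fRange applied to the attribute's value.
    static Concept explicit(String att, DoubleUnaryOperator fRange) {
        return a -> a.containsKey(att) ? fRange.applyAsDouble(a.get(att)) : 0.0;
    }

    public static void main(String[] args) {
        // Left-shoulder(10, 30): 1 below 10, 0 above 30, linear in between ([FCP 1]).
        DoubleUnaryOperator youngAge = x -> x <= 10 ? 1.0 : x >= 30 ? 0.0 : (30 - x) / 20.0;
        Concept person = a -> 1.0;                          // asserted with degree 1
        Concept young = and(person, explicit("Age", youngAge)); // [Ax 7]-style definition
        Concept veryYoung = modified(d -> d * d, young);        // very(x) = x^2, [FCP 3]
        Map<String, Double> lina = Map.of("Age", 12.0);
        System.out.println(young.degree(lina));      // ~0.9, cf. Example 2 below
        System.out.println(veryYoung.degree(lina));  // ~0.81, cf. Example 5
    }
}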
Example 2. Assume that the individual Lina in our earlier illustrative example is a 12-year-old girl. Person is already marked sure (⟨Person, S, 1⟩). Based on [Ax 7] and [FCP 1], Get_degree (Young, Lina) returns 0.9, and thus the fuzzy concept Young will be marked ⟨Young, S, 0.9⟩.

Example 3. Consider the individual Tom in Example 1. Based on its description, [Ax 8] and [FCP 2], Get_degree (Adult, Tom) = 0.6, and Matching will mark this concept as ⟨Adult, S, 0.6⟩. Moreover, based on [Ax 9], Teacher will be marked as ⟨Teacher, P, null⟩.

4.3 Marks-Propagation (C*, label, degree) procedure

In order to accelerate the classification process, Marks-Propagation minimizes the number of concepts to be verified by the Matching procedure. It is a recursive procedure that propagates marks to concepts related to C*, and to their related concepts. This procedure propagates marks based on the mark of C*, according to certain rules and under Zadeh Semantics. For instance, according to [R.1], all synonyms of C* will be marked sure to some degree d. Then, each of these synonyms D becomes the new input of Marks-Propagation, which starts to propagate marks to all concepts related to D, and so on, until there is no rule to be applied or no unmarked related concept to be marked.

[R.1] If ⟨C*, S, d⟩, then ∀D, C* ≡ D: ⟨D, S, d⟩.
[R.2] If ⟨C*, I, null⟩, then ∀D, C* ≡ D: ⟨D, I, null⟩.
[R.3] If ⟨C*, S, d⟩, then ∀D, C* ≡ ¬D: ⟨D, I, null⟩.
[R.4] If ⟨C*, I, null⟩, then ∀D, C* ≡ ¬D: ⟨D, S, d⟩ / d = Get_degree (D, A). In this case, we can confirm the membership of A in D. However, the degree of this membership must be computed by Get_degree (D, A).
[R.5] If ⟨C*, S, d⟩, then ∀D, C* ⊑ D: ⟨D, S, ≥ d⟩.
[R.6] If ⟨C*, I, null⟩, then ∀D, D ⊑ C*: ⟨D, I, null⟩.
[R.7] If ⟨C*, P, null⟩, then ∀D, D ⊑ C*: ⟨D, label, null⟩ / label ∉ {S}. This rule can be used to check some consistency problems. Indeed, if C* is possible for A, then A is incomplete for C* and for all of its more specific concepts.
[R.8] If ⟨C*, S, d⟩, then ∀D, D ≡ m(C*): ⟨D, S, fm (d)⟩.
[R.9] If ⟨C*, S, d⟩, then ∀D, D ≡ w(C*): ⟨D, S, w·d⟩.

Supposition 1. D is defined by a concept conjunction including C*, as D ≡ C* ⊓ C1 ⊓ … ⊓ Cn.

[R.10] If ⟨C*, I, null⟩, then ⟨D, I, null⟩.
[R.11] If ⟨Ci, S, di⟩ for i = 1..n and ⟨C*, S, d⟩, then ⟨D, S, deg⟩ / deg = min (d, di), i = 1..n.
[R.12] If ⟨D, I, null⟩, ⟨C*, S, d⟩ and ∃j ∈ {1..n} with ⟨Cj, "", ""⟩ (which means that Cj is unmarked), while ∀i ∈ {1..n}, i ≠ j: ⟨Ci, S, di⟩, then ⟨Cj, I, null⟩.

Supposition 2. D is defined as a concept disjunction including C*, as D ≡ C* ⊔ C1 ⊔ … ⊔ Cn.

[R.13] If ⟨C*, S, d⟩, then ⟨D, S, deg⟩ / deg = max (d, di), i = 1..n.
[R.14] If ⟨Ci, I, null⟩ for i = 1..n and ⟨C*, I, null⟩, then ⟨D, I, null⟩.
[R.15] If ⟨D, S, d⟩, ⟨C*, I, null⟩ and ∃j ∈ {1..n} with ⟨Cj, "", ""⟩, while ∀i ∈ {1..n}, i ≠ j: ⟨Ci, I, null⟩, then ⟨Cj, S, d⟩.
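As a compact illustration, the following Java sketch (our own encoding, not the prototype's Marks-Propagation code) applies three of these rules; the resulting marks agree with Examples 4 and 5 below.

import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

public class MarksPropagationSketch {
    enum Label { S, P, I }
    record Mark(Label label, Double degree) { }       // degree is null for P and I
    static final Map<String, Mark> marks = new HashMap<>();

    // [R.3]: the negation of a sure concept is marked impossible.
    static void negationOfSure(String d) { marks.put(d, new Mark(Label.I, null)); }
    // [R.8]: D defined as m(C*) with <C*, S, d> is marked <D, S, fm(d)>.
    static void modifiedOf(String d, double deg, DoubleUnaryOperator fm) {
        marks.put(d, new Mark(Label.S, fm.applyAsDouble(deg)));
    }
    // [R.11]: a conjunction whose conjuncts are all sure is sure to degree min(di).
    static void conjunctionOf(String d, double... degs) {
        double min = 1.0;
        for (double x : degs) min = Math.min(min, x);
        marks.put(d, new Mark(Label.S, min));
    }

    public static void main(String[] args) {
        conjunctionOf("Man", 1.0, 1.0);           // [Ax 5], [FCA 1], [FCA 2]: <Man, S, 1>
        negationOfSure("Female");                 // [Ax 4]: <Female, I, null>
        modifiedOf("VeryYoung", 0.9, x -> x * x); // [Ax 10]: <VeryYoung, S, 0.81>
        System.out.println(marks);
    }
}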
Example 4. Recall the individual Tom from our illustrative example. Since Person and Male are already marked Sure ([FCA 1] and [FCA 2]), Marks-Propagation can propagate the mark Sure to the concept Man, based on [Ax 5] and applying [R.11]: ⟨Man, S, 1⟩. It can also propagate the mark Impossible to Female, based on [Ax 4] and applying [R.3]. Moreover, Woman will be impossible for Tom, based on [Ax 6] and applying [R.10].

Example 5. During the classification of Lina in Example 2, we have generated the result ⟨Young, S, 0.9⟩. Thus, based on [Ax 10] and using [R.8], Marks-Propagation can mark the modified concept VeryYoung as ⟨VeryYoung, S, 0.81⟩. Consider the same concepts and the individual Tom. If Matching (Young, Tom) results in ⟨Young, I, null⟩, then the same mark will be propagated to VeryYoung as ⟨VeryYoung, I, null⟩.

4.4 Next-Concept (C*) function

The aim of this function is to select a new unmarked concept to be the next current concept, by traversing the hierarchy of fuzzy concepts. We use breadth-first search, one of the important graph traversal techniques, to explore the hierarchy graph. Using this technique, Next-Concept selects the next unmarked neighbouring concept of C*. After testing all the unmarked neighbours, the function moves to the next level of the hierarchy and goes from left to right to select a new target concept. If there are no more unmarked concepts, Next-Concept returns null.

In our work, we were inspired by the multi-viewpoints classification algorithm proposed in [18], in which classification was used in an object-oriented multi-viewpoints representation system named TROPES. This algorithm provides multi-viewpoints instance classification, in which an instance can be classified in one or more viewpoints. This work was extended to consider individual reclassification in multi-viewpoints ontologies [11]. These multi-viewpoints classification algorithms [11, 18] are both based on the hypothesis of the exclusiveness of sister classes, which assumes that classes at the same hierarchy level (called sister classes) represent mutually exclusive sets. Therefore, an individual which belongs to a class cannot belong to any of its sister classes. Unlike the cited works, our algorithm is not based on this hypothesis. Indeed, in our fuzzy ontology conceptualization, fuzzy concepts are modelled as fuzzy subsets [8]. The strength of fuzzy logic in knowledge representation lies in the intersections between fuzzy subsets, as an element can belong to several fuzzy subsets with different membership degrees. Consequently, the main advantage of fuzzy classification compared to classical classification is that an element is not limited to a single class but can be assigned to several sister classes, which better reflects reality.

Example 6. Consider the two fuzzy concepts Child and Teenager, defined by their trapezoidal membership functions (see Figure 3). For the little girl Lina of Example 2, we can calculate these memberships: Child (Lina) = 0.66 and Teenager (Lina) = 0.33. These results indicate that Lina is considered a Child but also a Teenager, with different membership degrees.

Figure 3: Assignment of an individual to different fuzzy concepts.
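Since the exact trapezoid parameters of Child and Teenager are given only graphically in Figure 3, the following Java sketch uses hypothetical parameters (our assumption, chosen so that the reported degrees of Example 6 are approximately reproduced, up to rounding) to recompute the two memberships.

// Trapezoidal(a, b, c, d): 0 outside [a, d], 1 on [b, c], linear on the edges.
public class SisterClasses {
    static double trapezoidal(double x, double a, double b, double c, double d) {
        if (x <= a || x >= d) return 0.0;
        if (x < b)  return (x - a) / (b - a);  // rising edge
        if (x <= c) return 1.0;                // plateau
        return (d - x) / (d - c);              // falling edge
    }

    public static void main(String[] args) {
        double age = 12.0; // Lina, Example 2
        // Hypothetical parameters consistent with Child(Lina) ~ 0.66, Teenager(Lina) ~ 0.33.
        double child    = trapezoidal(age, 0, 0, 10, 16);
        double teenager = trapezoidal(age, 10, 16, 17, 19);
        System.out.printf("Child(Lina) = %.2f, Teenager(Lina) = %.2f%n", child, teenager);
        // Lina belongs to both sister classes, with different membership degrees.
    }
}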
5 Individual relocation: an extension of the fuzzy realization approach

The proposed algorithm provides a complete and efficient realization service for fuzzy ontologies. Indeed, it can efficiently classify individuals, even incomplete ones, into their appropriate belonging concepts with their membership degrees. With this, we can ensure an evolutionary aspect of fuzzy ontologies by realizing them with new individuals. After their classification, individuals may evolve and update their knowledge. Indeed, a person changes age, address or profession. Therefore, a relocation process is necessary to evolve fuzzy ontologies. To this end, our proposed algorithm is extensible. In fact, an extension of the fuzzy realization algorithm may consider another aspect of fuzzy ontology evolution, in which already classified, but updated, individuals can be relocated. This process allows an individual to migrate from its current belonging concepts to new ones that satisfy its updated description [10]. Changes of an individual description may be the result of an:

- Enrichment of an incomplete individual, by replacing its unknown value by a concrete one,
- Modification of a concrete value by a new one, or
- Impoverishment, i.e. removal of a concrete value and its replacement by an unknown one.

In the first two cases, we have to handle new data. This data can satisfy the fuzzy ontology constraints and thus result in a consistent fuzzy ontology. It can also be in contradiction with some constraints and thus generate an inconsistency:

Fuzzy ontology in a consistent state. In this case, the individual's belonging concepts must keep their marks as ⟨Ci*, S, di⟩. However, the individual's new description may allow it to migrate to concepts that are more specific. Thus, for this first case, a simple realization process is invoked to descend the evolved individual in the hierarchy, starting at its belonging concepts.

Fuzzy ontology in an inconsistent state. To deal with this inconsistency, a fuzzy relocation process is invoked to migrate the updated individual to its new belonging concepts. To this end, the individual is raised up in the fuzzy hierarchy by following the path of its sure super-concepts until the first super-concept for which the new data satisfies its constraints. It should be noted that all super-concepts along the individual's path (except the last one) must change their marks from ⟨Ci*, S, di⟩ to ⟨Ci*, I, null⟩. To complete the individual relocation, the updated individual must descend in the fuzzy hierarchy until it reaches its new belonging concepts. Indeed, the individual's updated description can satisfy other concepts that are more specific than the first consistent super-concept. Thus, starting at this concept, the fuzzy realization process is invoked.

The individual's knowledge can also evolve to an unknown value. This impoverishment will not affect the ontology consistency. However, the evolved individual becomes incomplete, since the concrete value of the updated attribute has been replaced with an unknown one. To handle this change, the fuzzy ontology must evolve and the updated individual must be raised up, by following the path of its super-concepts until the first sure super-concept in which there is no specification for the impoverished attribute. All these super-concepts (except the last one) must change their marks from ⟨Ci*, S, di⟩ to ⟨Ci*, P, null⟩. Unlike the enrichment/modification cases, in the case of an individual impoverishment, once the ascent to the first sure super-concept is done, no further descent is possible. Indeed, there is no concrete new data to be matched with more specific concepts.
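The ascent step of this relocation process can be pictured with the following Java sketch (entirely our own, hypothetical encoding of the mark-flipping described above, using the Young concept of the illustrative example; it is not the code of the proposed extension).

// Illustrative sketch of the relocation ascent after an inconsistent update.
import java.util.Map;
import java.util.function.Predicate;

public class RelocationSketch {
    static class Node {
        final String name;
        final Node parent;
        final Predicate<Map<String, Double>> constraints; // does A satisfy this concept?
        String mark = "S";                                // assume A was sure here
        Node(String name, Node parent, Predicate<Map<String, Double>> c) {
            this.name = name; this.parent = parent; this.constraints = c;
        }
    }

    // Ascend from the individual's most specific sure concept; every concept whose
    // constraints the updated description violates changes its mark from S to I.
    static Node ascend(Node current, Map<String, Double> updated) {
        while (current != null && !current.constraints.test(updated)) {
            current.mark = "I";       // <Ci*, S, di> becomes <Ci*, I, null>
            current = current.parent;
        }
        return current; // first consistent super-concept; realization restarts here
    }

    public static void main(String[] args) {
        Node person = new Node("Person", null, a -> true);
        Node young  = new Node("Young", person, a -> a.get("Age") < 30);
        Map<String, Double> tom = Map.of("Age", 33.0);  // Tom's updated age
        Node restart = ascend(young, tom);
        System.out.println("Restart realization at: " + restart.name); // Person
        System.out.println("Young mark: " + young.mark);               // I
    }
}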
6 Validation of the proposed algorithm

In order to validate our ideas, we have implemented the proposed fuzzy realization algorithm (as part of a masters' project [15]) as Fuzzy Realizer. It is a Java prototype implementation that supports a fuzzy extension of the well-known DL Z SHOIN (D). Fuzzy Realizer has a graphical interface for displaying the fuzzy ontology in the form of a coloured directed acyclic graph (DAG), in order to improve the presentation of the results and thereby facilitate the decision-making process (see Figure 4).

Fuzzy Realizer has a modular architecture and is divided into three modules: the Parser, Visualization and Classification modules. The Parser translates the fuzzy ontology into an internal format, so that any fuzzy ontology encoded in any language (OWL, Fuzzy OWL, OWL 2, ...) can be used. The Visualization module displays the loaded ontology hierarchy in the form of a DAG. Finally, Classification, the proposed system's key module, calculates the new individual's membership in the different (fuzzy) concepts. Once the individual is attached to its belonging concepts, the Visualization module displays the concepts' marks on the created DAG. Each mark is represented by a colour, which produces a coloured DAG. The colours red, orange and green are used to represent, respectively, impossible, possible and sure concepts.

Figure 4: Fuzzy Realizer interface.

In order to further facilitate the decision-making process, membership degrees are represented by numerical values on the coloured graph nodes and also by gradations of the colour green, ranging from light green, which represents a low membership, to dark green, which represents full membership (see Figure 5).

In order to evaluate the proposed system's performance, we carried out a range of experiments with different fuzzy ontologies, beginning with a simple Medical Checkup Fuzzy Ontology (MCFO) and then using more highly expressive and voluminous fuzzy ontologies: Fuzzy Wine (http://users.abo.fi/rowikstr/FuzzyWineOntology/FuzzyWineOntology.owl), Matchmaking (http://www.umbertostraccia.it/cs/software/FuzzyOWL/ontologies/matchmaking.owl) and Multi-criteria decision making (http://www.umbertostraccia.it/cs/software/FuzzyOWL/ontologies/multiCriteria.owl). We present the results for two of these ontologies in the following subsections. We also compared our Fuzzy Realizer with the well-known fuzzy reasoner FuzzyDL [6, 9]; this was done by replacing the Classification module with the fuzzy reasoner FuzzyDL.

6.1 Medical Checkup Fuzzy Ontology (MCFO)

Uncertainty is the central critical fact about reasoning in the e-health domain. Usually, doctors cannot give exact diagnoses and laboratories cannot report exact analysis results. Despite this uncertainty, doctors have to make decisions. In order to implement a decision-making process using medical check-up fuzzy knowledge, we developed MCFO using the fuzzy DL SIQ (D) and Fuzzy OWL. Then, using Fuzzy Realizer, we realized it with new individuals. Table 3 represents the description of the new (incomplete) individual Tim.

Table 3: Description of the new individual Tim.

Attribute                  Value
Body Temperature           37.45°
Blood Sugar                1.0 g/l
Body Mass Index            26.0 kg/m²
Heart Pulse                Unknown
Respiratory Rate           Unknown
Diastolic Blood Pressure   70.0 mmHg
Systolic Blood Pressure    100.0 mmHg
Calcium Level              2.3 mmol/l

Although it is an incomplete individual, Fuzzy Realizer was able to classify it as low as possible in the hierarchy (see Figure 5) by providing the set of its sure (to some degree) and possible fuzzy concepts. This classification cannot be done using FuzzyDL, since it does not offer a service for classifying incomplete individuals.
Figure 5: Zoom of the classification of the incomplete individual Tim.

6.2 Fuzzy Wine ontology

The fuzzy extension of the well-known and highly expressive Wine ontology, supporting the DL SHOIN (D), is the most voluminous open-source fuzzy ontology. Thus, we used this ontology in order to test our proposed system's performance. Despite its large size, we have been able to realize it with new individuals using our prototype. Figure 6 shows the classification of the new individual ChateauDeMeursauCru2007, described in Table 4, which is considered to be a HighUWSWine to degree 0.1 and a fully (degree = 1) HighPriceWine and TableWine.

Figure 6: Realizing Fuzzy Wine.

Table 4: Description of the new individual ChateauDeMeursauCru2007.

Attribute   Value
Price       38.6
PH          3.42
Acidity     5.8
Sugar       1.7
UWSScore    89.0
Flavor      ModerateWineFlavor
Maker       ChateauDeMeursaultWinery

6.3 Discussion

Although Fuzzy Realizer is a simple prototype providing a realization service for fuzzy ontologies, several series of tests show that it offers an efficient realization service, since it results in correct classifications (all results were verified by domain experts). Moreover, it is capable of realizing any fuzzy ontology without any constraint on the imperfection of the represented knowledge. It is also able to realize highly expressive fuzzy ontologies, even with incomplete individuals. Indeed, it was used to realize the most voluminous open-source fuzzy ontology (Fuzzy Wine).

Figure 7: Response time of Fuzzy Realizer modules.

More importantly, its response time is within the limits of acceptability compared to the well-known fuzzy reasoner FuzzyDL, as shown in Figure 7. All of these characteristics allow the proposed prototype to be tested in a real application and to handle real-world knowledge. In sum, despite its simplicity, Fuzzy Realizer can be considered an optimal solution for realizing fuzzy ontologies. In contrast, FuzzyDL is one of the most expressive and important fuzzy reasoners. However, its long runtime compared with Fuzzy Realizer and its inability to classify individuals in cases where we may lack information, which is a quite common problem, are weaknesses which cannot be ignored.

7 Conclusion

In this paper, we have proposed a fuzzy-based approach for reasoning with imperfect ontological knowledge. As a reasoning mechanism, we have integrated fuzzy logic with the most powerful human reasoning activity, known as classification. Using fuzzy classification, we have proposed Fuzzy Realizer, a Java prototype for classifying new individuals into fuzzy ontologies. It allows (i) fuzzy concrete domains, (ii) modified concepts and (iii) weighted concepts. We have focused on just one reasoning task in order to address an aspect of fuzzy ontology evolution, namely the realization issue. The proposed prototype can realize fuzzy ontologies even with incomplete individuals. In addition, it offers a more human-oriented classification by assigning an individual to several fuzzy sister classes, which hides the sharp boundaries between them.

As future work, we would like to extend the proposed prototype so that it will not be limited to Zadeh semantics, but will be more flexible by supporting more fuzzy logics, for instance Łukasiewicz, Gödel or Product logics. We also intend to implement the relocation process extension so that we can test and evaluate the proposed idea.
Finally, in order to improve Fuzzy Realizer's performance, we would like to minimize the use of the mark 'possible'. To that end, we intend to propose a new conceptualization of concepts by dividing each concept's set of attributes into two groups: key attributes and auxiliary ones. During the classification of an incomplete individual, if the 'unknown' attribute is an auxiliary one, then the current concept can be marked 'sure'. For example, if Tom has obtained a medical diploma, then even though we may lack information about his age, his address or even his last name, we can be sure that he is a doctor. Therefore, in order to mark the concept Doctor as 'sure', it is not necessary to have known values for all attributes.

Acknowledgements

We would like to thank the anonymous referees for their valuable comments on an earlier version of this paper.

References

[1] Alexopoulos, P., Wallace, M., Kafentzis, K. & Askounis, D. (2012) 'IKARUS-Onto: a methodology to develop fuzzy ontologies from crisp ones', Knowledge and Information Systems, Vol. 32 No. 3, pp. 667-695.

[2] Bobillo, F., Delgado, M. & Gómez-Romero, J. (2013) 'Reasoning in Fuzzy OWL 2 with DeLorean'. In Bobillo, F. et al. (Eds.), Uncertainty Reasoning for the Semantic Web II, Vol. 7123 of Lecture Notes in Computer Science, Springer-Verlag, pp. 119-138.

[3] Bobillo, F. & Straccia, U. (2008) 'fuzzyDL: an expressive fuzzy description logic reasoner'. In Proceedings of the 17th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), IEEE Computer Society Press, pp. 923-930.

[4] Bobillo, F. & Straccia, U. (2011) 'Fuzzy ontology representation using OWL 2', International Journal of Approximate Reasoning, Vol. 52 No. 7, pp. 1073-1094.

[5] Bobillo, F. & Straccia, U. (2013) 'General concept inclusion absorptions for fuzzy description logics: A first step'. In Description Logics, pp. 513-525.

[6] Bobillo, F. & Straccia, U. (2016) 'The fuzzy ontology reasoner fuzzyDL', Knowledge-Based Systems, Vol. 95, pp. 12-34.

[7] Calegari, S. & Ciucci, D. (2007) 'Fuzzy ontology, fuzzy description logics and fuzzy-OWL'. In Proceedings of the 7th International Workshop on Fuzzy Logic and Applications (WILF 2007), Vol. 4578 of Lecture Notes in Computer Science, Springer-Verlag, pp. 118-126.

[8] Djellal, A. & Boufaida, Z. (2012) 'Conceptualisation d'une Ontologie Floue'. In Proceedings of 9ème Colloque sur l'Optimisation et les Systèmes d'Information, Tlemcen, Algeria, pp. 62-73.

[9] Djellal, A. & Boufaida, Z. (2014) 'Fuzzy ontology evolution: classification of a new individual', Journal of Emerging Technologies in Web Intelligence, Vol. 6 No. 1, pp. 9-14.

[10] Djellal, A. & Boufaida, Z. (2016) 'Individual relocation: a fuzzy classification based approach'. In Model and Data Engineering: 6th International Conference, MEDI 2016, Almería, Spain, September 21-23, 2016, Proceedings, Vol. 9893, Springer, p. 209.

[11] Djezzar, M., Hemam, M. & Boufaida, Z. (2012) 'Ontological re-classification of individuals: a multi-viewpoints approach'. In Proceedings of the 2nd International Conference on Model and Data Engineering, LNCS 7602, Springer, pp. 91-102, Poitiers, France.

[12] Ferrara, A., Ludovico, L.A., Montanelli, S., Castano, S. & Haus, G. (2006) 'A semantic web ontology for context-based classification and retrieval of music resources', ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol. 2 No. 3, pp. 177-198.

[13] Gao, M. & Liu, C.
(2005) 'Extending OWL by fuzzy description logic'. In Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 05), IEEE Computer Society, pp. 562-567.

[14] Ghorbel, H., Bahri, A. & Bouaziz, R. (2009) 'Fuzzy Protégé for fuzzy ontology models'. In Proceedings of the 11th International Protégé Conference (IPC'2009), Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands.

[15] Hecham, A. & Iaiche, I. E. (2015) 'Système de classification et de visualisation d'instances dans une ontologie floue'. Masters thesis, Constantine 2 - Abdelhamid Mehri University, Constantine, Algeria.

[16] Kaufmann, M. & Meier, A. (2009) 'An inductive fuzzy classification approach applied to individual marketing'. In Fuzzy Information Processing Society, NAFIPS 2009, Annual Meeting of the North American, IEEE, pp. 1-6.

[17] Konstantopoulos, S. & Charalambidis, A. (2010) 'Formulating description logic learning as an inductive logic programming task'. In Proceedings of the 19th IEEE International Conference on Fuzzy Systems, IEEE Press.

[18] Mariño, O. (1993) 'Raisonnement classificatoire dans une représentation à objets multi-points de vue'. PhD thesis, Joseph-Fourier-Grenoble I University, France.

[19] Meier, A., Schindler, G. & Werro, N. (2008) 'Fuzzy classification on relational databases'. In J. Galindo (Ed.), Handbook of Research on Fuzzy Information Processing in Databases, Vol. 2, Idea Group Publishing, Hershey, PA, pp. 586-614.

[20] Palash, D., Hrishikesh, B. & Tazid, A. (2011) 'Fuzzy arithmetic with and without using α-cut method: a comparative study', International Journal of Latest Trends in Computing, Vol. 2 No. 1, pp. 99-107.

[21] Pérez, I. J., Wikström, R., Mezei, J., Carlsson, C. & Herrera-Viedma, E. (2013) 'A new consensus model for group decision making using fuzzy ontology', Soft Computing, Vol. 17 No. 9, pp. 1617-1627.

[22] Rodríguez, N. D., Cuéllar, M. P., Lilius, J. & Calvo-Flores, M. D. (2014) 'A fuzzy ontology for semantic modelling and recognition of human behaviour', Knowledge-Based Systems, Vol. 66, pp. 46-60.

[23] Scharrenbach, T. & Bernstein, A. (2009) 'On the evolution of ontologies using probabilistic description logics'. In Proceedings of the First ESWC Workshop on Inductive Reasoning and Machine Learning on the Semantic Web.

[24] Simou, N., Mailis, T.P., Stoilos, G. & Stamou, G.B. (2010) 'Optimization techniques for fuzzy description logics'. In Description Logics.

[25] Stoilos, G., Simou, N., Stamou, G. & Kollias, S. (2006) 'Uncertainty and the semantic web: intelligent systems', IEEE Intelligent Systems, Vol. 21 No. 5, pp. 84-87.

[26] Stoilos, G., Stamou, G. & Pan, J.Z. (2010) 'Fuzzy extensions of OWL: logical properties and reduction to fuzzy description logics', International Journal of Approximate Reasoning, Vol. 51 No. 6, pp. 656-679.

[27] Straccia, U. (2001) 'Reasoning within fuzzy description logics', Journal of Artificial Intelligence Research (JAIR), Vol. 14, pp. 137-166.

[28] Straccia, U. (2012) 'Description logics with fuzzy concrete domains'. arXiv preprint arXiv:1207.1410.

[29] Straccia, U. (2013) Foundations of Fuzzy Logic and Semantic Web Languages. CRC Press.

[30] Straccia, U. (2015) 'All about fuzzy description logics and applications'. In Reasoning Web. Web Logic Rules, Springer International Publishing, pp. 1-31.

[31] Tsatsou, D., Dasiopoulou, S., Kompatsiaris, I. & Mezaris, V. (2014) 'LiFR: A lightweight fuzzy DL reasoner'. In The Semantic Web: ESWC 2014 Satellite Events, Springer International Publishing, pp. 263-267.
SK-languages as a Powerful and Flexible Semantic Formalism for the Systems of Cross-Lingual Intelligent Information Access

Vladimir A. Fomichov
School of Business Informatics, Faculty of Business and Management
National Research University Higher School of Economics (HSE), Kirpichnaya str. 33, 105187 Moscow, Russia
E-mail: vfomichov@hse.ru, http://www.hse.ru/eng/org/persons/67739

Keywords: semantic parsing, theory of K-representations, abstract meaning representation, formal representation of semantic content

Received: January 26, 2017

The first starting point of this paper is the broadly accepted idea of employing an artificial semantic language-intermediary as a promising methodology for the realization of automatic cross-lingual intelligent information access to natural language (NL) texts on the Web. The second one is the emergence in computational semantics during 2013-2016 of great interest in the semantic formalism (more exactly, notation) called Abstract Meaning Representation (AMR). This formalism was introduced in 2013 in an ACL publication by a group of ten researchers from the UK and the USA. This paper shows that much broader prospects for creating semantic languages-intermediaries in comparison with AMR are opened by the theory of K-representations (TKR), developed by V. A. Fomichov. The basic mathematical model of TKR describes the regularities of NL structured meanings; its mathematical essence is that it introduces a system of ten partial operations on conceptual structures. The initial version of this model was published in 1996 in Informatica (Slovenia). The second version of the model (stated in a monograph released by Springer in 2010) defines a class of formal languages called SK-languages (standard knowledge languages). It is demonstrated that SK-languages allow us to simulate all expressive mechanisms of AMR. The advantages in comparison with AMR are, in particular, the possibilities to construct semantic representations of compound infinitive constructions (expressing goals, commitments, etc.), of compound descriptions of notions and sets, and of complex discourses and knowledge pieces.

Povzetek: Opisani so SK-jeziki za fleksibilno med-jezikovno dostopanje.

1 Introduction

During the last decade, one has been able to observe a quickly growing interest in the design of computer intelligent agents fulfilling cross-lingual information retrieval (CLIR) on the Web. It is a consequence of the emergence of a huge, permanently increasing number of Web sources in languages other than English. In September 2012, a seminar on the Multilingual Semantic Web (MSW) was organized at Dagstuhl Castle in Germany. The proceedings of this seminar contain the following data [5]: in the year 2010, the number of non-English-speaking Internet users was almost three times as high as the number of English-speaking users (1430 million vs. 536 million users). That is why the problem of developing an MSW is very topical [24-26, 35, 49, 56].
It is broadly accepted that a promising approach to the realization of CLIR on the Web is to employ a special semantic language-intermediary (SLI) in order to represent in the same format both the semantic content of a user query and the semantic content of the analysed fragment of a text in natural language (NL) [4, 7, 13-20, 24-26, 30, 32, 46, 49, 51, 52, 56]. The problem of creating a broadly applicable and flexible SLI goes far beyond the scope of CLIR. During the last decade, the semantic parsing branch of computational linguistics has been considerably strengthened and expanded [36]. The main objective of this branch is to develop and implement algorithms extracting meanings from NL-texts and forwarding them to the pragmatic subsystems of applied intelligent systems. The real resurrection of the semantic parsing branch (after two decades when statistics-oriented approaches to NL processing dominated) has been caused, first of all, by the rapid progress in designing autonomous intelligent agents (robots) and various mobile devices (cell phones, tablets, etc.) [36, 44, 45]. Another reason is the problem of understanding Web sources in many natural languages on requests of the end users or of computer intelligent agents. The use of an SLI is also reasonable in full-text question-answering systems and in NL interfaces (in particular, to robots and mobile devices), even in the case of texts in one language. There is one more circumstance showing the high topicality of developing broadly applicable and flexible SLIs. During the last decade, several IT companies have emerged in different countries whose principal objective is to combine the informational technologies of the Semantic Web and NL processing. In particular, these are Ontos GmbH in Switzerland [40, 53] and Cambridge Semantics Inc., The Smart Data Company, in Boston, MA, USA [6]. During the last decade, many scholars have seen a reasonable way of creating the preconditions for understanding NL-texts by computer systems in developing special linguistic databases containing sentences associated with manually constructed semantic representations (SRs); in other terms, with semantic annotations. Since the year 2013, numerous papers have been published on employing the notation called Abstract Meaning Representation (AMR) for constructing semantic annotations of NL sentences, in particular, of sentences in English, Czech, and Chinese [1, 2, 35, 43, 47, 54, 55]. The aim of this paper is to attract the attention of researchers in computational semantics to the fact that there is a formal theory opening much broader prospects for building SRs of NL sentences and discourses in comparison with AMR. It is the theory of K-representations (knowledge representations) - an original theory of designing semantic parsers of NL-texts with the broad use of formal means for representing the input, intermediary, and output data of the algorithms. Besides, it enriches the logical-informational foundations of the MSW, multi-agent systems, e-commerce, knowledge representation in advanced ontologies, and knowledge representation in multimedia databases. The monographs [21, 25] state two versions of the theory of K-representations (TKR). It is an expansion of the theory of K-calculuses and K-languages (the KCL-theory). The basic ideas and results of TKR are set forth in numerous publications both in Russian and English, in particular, in [12-30].
TKR is the kernel of the Integral Formal Semantics of NL; its basic principles and composition are stated in [16] and in Chapter 2 of [25]. The structure of this paper is as follows. Section 2 analyses related approaches; the main attention is paid to Semantic Role Labeling, Frame-Semantic Parsing, and Abstract Meaning Representation. Section 3 contains a task statement. Section 4 shortly describes the expressive mechanisms of SK-languages, introduced by TKR. Section 5 sets forth the principal distinctive features of the algorithms of semantic parsing proposed by TKR. Section 6 shortly indicates the computer applications of TKR. Section 7 outlines the prospects of using SK-languages in the development of an MSW. Section 8 concludes the paper.

2 Related approaches

2.1 Semantic role labeling branch of computational semantics

The goal of extracting meaning from NL-texts (and constructing its complete or partial representation) emerged in many application domains in the early 2000s and initiated a number of research projects throughout the world. The main stream in this field includes, in particular, the interrelated branches called Semantic Role Labeling (SRL) and Frame-Semantic Parsing (FSP). The principal task considered in SRL is to find semantic relations (called semantic roles) between the verbal forms (and some other predicate words) and the dependent word groups. For instance, it is possible to find the semantic roles Agent, Phenomenon, and Time in the sentence "The Russian Nobel laureate Ivan Pavlov discovered conditioned reflexes in the beginning of the XXth century". The aim of SRL is, firstly, to find the realized semantic roles and, secondly, to construct a formal expression called a semantic representation in order to process it in the context of a discussed situation and an ontology. The fundamental problem of SRL is that in the early 2000s one felt the lack of formal means for reflecting the semantic structure of arbitrary sentences. Example. Let S1 = "Yesterday Robert heard that the firm "Rainbow" would move to Manchester", S2 = "Robert decided to leave the firm "Rainbow"". Regretfully, as recently as five years ago the field of SRL did not possess effective formal means for building SRs of sentences with complex direct or indirect speech, with infinitive constructions, and with modalities. In particular, this applies to the sentences S1 and S2. A significant twofold event in the development of this branch was the publication of the pioneering work [31] on a computer program for statistical SRL and the creation of the PropBank annotations repository [33]. These two publications became the starting point for designing a number of applied computer systems aimed at finding predicate-argument structures reflecting the semantics of sentences and short discourses. The PropBank annotations consist of phrase-structure syntax trees from the Wall Street Journal section of the Penn Treebank [38] complemented by predicate-argument structures for the verbs. PropBank uses the core roles ARG0 through ARG5, and these roles have different interpretations for different predicates. There are many studies aimed at SRL and using PropBank conventions [3, 39, 42]. The problem with using such predicate-argument structures is that the roles ARG2 - ARG5 serve many different purposes for different verbs [58]. A way out is provided by the branch of NL processing (NLP) called Frame-Semantic Parsing, closely connected with the branch SRL [9].
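To make the PropBank conventions just described concrete, a predicate-argument structure can be sketched as a small data structure. The sketch below is in Python (the language named in Section 4.2 as the implementation language of SemSyn); the frameset id "discover.01" and the concrete role fills are illustrative assumptions, not material taken from PropBank or from the cited systems.

# A minimal sketch of a PropBank-style predicate-argument structure.
# The role inventory (ARG0..ARG5 plus ARGM-* modifiers) follows
# PropBank conventions; the frameset id and spans are illustrative.
from dataclasses import dataclass, field

@dataclass
class PredicateArgumentStructure:
    predicate: str                             # target verb lemma
    roleset: str                               # frameset id, e.g. "discover.01"
    args: dict = field(default_factory=dict)   # role label -> text span

pas = PredicateArgumentStructure(
    predicate="discover",
    roleset="discover.01",                     # hypothetical frameset id
    args={
        "ARG0": "The Russian Nobel laureate Ivan Pavlov",    # Agent
        "ARG1": "conditioned reflexes",                      # Phenomenon
        "ARGM-TMP": "in the beginning of the XXth century",  # Time
    },
)

for role, span in pas.args.items():
    print(f"{role:9s} -> {span}")

Note how the structure itself says nothing about what ARG0 or ARG1 mean for this particular verb; exactly this verb-specific reinterpretation of the core roles is the difficulty discussed above.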
The basis of this branch is the linguistic resource FrameNet [10]; it stores significant information about the lexical semantics and predicate-argument semantics of sentences in English. The FrameNet lexicon contains semantic frames, each of which includes a list of lexical units - associated words and word combinations that are able to evoke the considered semantic frame in an NL expression. Besides, each semantic frame from FrameNet indicates several roles corresponding to the facets of the scenario represented by the frame. One says that targets are the predicates (verbs, etc.) evoking frames, and an argument is a word or a phrase filling a role. For example, the frame JUDGMENT from the FrameNet database contains the hand-annotated sentence "She blames the Government for failing to do enough to help". In this sentence, the following semantic roles are distinguished: Judge in the pair (She, blames), Evaluee in the pair (blames, the Government), Reason in the pair (blames, for failing to do enough to help). In the FrameNet database, the considered sentence is represented as follows: [Judge She] blames [Evaluee the Government] [Reason for failing to do enough to help]. In comparison with PropBank, which contains verbal predicates, FrameNet includes not only them but also adjectives, adverbs, and prepositions.

2.2 Abstract meaning representation formalism

In the late 2000s and early 2010s, it was possible to observe a serious incompleteness of the field of SRL. As mentioned above, the principal objective of the studies on SRL was to develop methods and algorithms aimed at discovering the semantic roles realized in sentences. The purpose of discovering semantic roles is the use of this information in building SRs of sentences and discourses for interacting with the pragmatic subsystems of applied intelligent systems. However, the scholars in the field of SRL possessed only rather restricted formal tools for building SRs of sentences. First of all, they felt the lack of convenient formal means for building semantic images of compound descriptions of objects and situations, of sentences with attributive clauses of purpose, of sentences with infinitive constructions, and of sentences expressing modalities. That is why in the late 2000s and early 2010s the scholars looked for more expressive semantic formalisms. As a result, new attention was attracted to the semantic formalism called Abstract Meaning Representation (AMR), introduced in [34]. This formalism began its new life in a modified form after the publication of the paper [1]. An AMR of a sentence S is an acyclic, rooted, directed graph with special marks of the vertices and edges. According to [34], a mark of a vertex has the form (label/concept), where label is a mark of an entity (e.g., label = m1) and concept is a string of the form |wd1| or |wd1, …, wdk|, where wd1, …, wdk are the words or word combinations expressing one notion (examples: |dog|, |eat, take in|). The paper [1] considers additional forms of concepts' descriptions: the framesets of the linguistic database PropBank ("want-01", etc.), special entity types ("world-region", etc.), the kinds of quantities ("distance-quantity", etc.), and the logical connectives "and", "or". It is possible to distinguish several main reasons explaining the quickly increasing interest in AMR. Reason 1. The possibility to explicitly indicate semantic roles in the descriptions of events.
It should be noted that AMRs use the generalized semantic roles arg0, …, arg5 employed in PropBank framesets [1]. Example 1 [1]. The sentence "The man described the mission as a disaster" can be associated with the AMR (d/describe-01 :arg0 (m/man) :arg1 (m2/mission) :arg2 (d2/disaster)).

Reason 2. The possibility to build compound designations of various entities from application domains. Example 2 [1]. The expression "a singing boy from the college" can be associated with the AMR (b/boy :arg0-of (s/sing-01) :source (c/college)).

Reason 3. A way of describing the semantic structure of sentences with infinitive constructions. Example 3 [35]. Let T1 = "The boy wants to go to New York". Then T1 may have the following AMR: (w/want-01 :arg0 (b/boy) :arg1 (g/go-01 :arg0 b :arg1 (c/city :wiki "New York" :name (n/name :op1 "New" :op2 "York")))).

Reason 4. The possibility to describe the semantic structure of sentences with modal words and infinitives. Example 4 [1]. The sentences "The boy doesn't have to go", "The boy isn't obligated to go", and "The boy need not go" may be associated with the AMR (p/obligate-01 :arg2 (g/go-01 :arg0 (b/boy)) :polarity -).

Other reasons are the possibilities to describe the semantic structure of (a) questions with interrogative words; (b) noun groups (e.g., "Elsevier N.V., the Dutch publishing group"); (c) sentences expressing the conceptual qualification relation ("This woman is a lawyer", etc.).
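To fix the shape of such AMR graphs in code before turning to their shortcomings, the sketch below encodes the AMR of Example 3 as nested Python tuples (variable, concept, list of (role, value) pairs) and renders it in PENMAN-like notation. The encoding is our own illustrative assumption, not an official AMR tool; a bare variable such as b expresses a re-entrancy (the boy is both the wanter and the goer).

# An illustrative encoding of an AMR graph as nested tuples:
# (variable, concept, [(role, value), ...]); a bare string value that
# repeats an earlier variable expresses a re-entrant edge.
amr = ("w", "want-01", [
    (":arg0", ("b", "boy", [])),
    (":arg1", ("g", "go-01", [
        (":arg0", "b"),                       # re-entrant reference
        (":arg1", ("c", "city", [
            (":wiki", '"New York"'),
            (":name", ("n", "name", [
                (":op1", '"New"'),
                (":op2", '"York"'),
            ])),
        ])),
    ])),
])

def to_penman(node, indent=0):
    """Render the nested-tuple encoding in PENMAN-like notation."""
    if isinstance(node, str):
        return node                           # variable reference or literal
    var, concept, edges = node
    pad = " " * (indent + 4)
    text = f"({var}/{concept}"
    for role, value in edges:
        text += f"\n{pad}{role} {to_penman(value, indent + 4)}"
    return text + ")"

print(to_penman(amr))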
It is possible to distinguish the following principal shortcomings of the AMR notation from the standpoint of using it in the models and algorithms of semantics-oriented NL processing.

1. Our linguistic intuition says that (a) the main words and word combinations of a sentence refer to various things, situations, and abstract entities; (b) there are various directed semantic connections between the fragments of the sentence, in particular, between such main words and word combinations. A directed graph with special marks of the vertices and edges is a structure visualizing quite well this perception of a sentence by our linguistic intuition. However, this product of scientific thought can be characterized as a surface-level, rather than deep, penetration into the mechanisms of NL semantics. That is why the AMR notation makes only a rather small contribution to the creation of models reflecting the essence of sentence understanding with respect to a knowledge base.

2. The linguistic intuition of scholars (not only of linguists) having command of several natural languages (e.g., of Russian and English, or of English, French, and German) says that there are several mental mechanisms underpinning the construction of NL semantic structures in different languages. For instance, English, Russian, French, and German do have infinitive constructions and compound descriptions of sets. However, the AMR approach doesn't formulate any conjecture about a system of expressive mechanisms responsible for constructing mental representations of sentences even in one language - in English.

3. Due to the above, the AMR approach doesn't give a special formal status to such constructions as semantic images of infinitive expressions, compound designations of sets, and sentences with modality. That is why the AMR approach seems to be of small use for constructing semantics-oriented models of NL communication.

4. The group of general semantic relations used in AMR seems to be a huge bag containing, in particular, such relations of different kinds as :age, :destination, :consist-of, and :purpose. The first unit is the name of a function, while the second to fourth units are the names of relations that are not functions. These principal peculiarities are not taken into account by the AMR approach.

5. The AMR approach says nothing about the SRs of discourses.

3 Task statement

It seems reasonable to analyse the new demands on computational semantics in the context of the problems faced by computational linguistics (CL) as a whole. The analysis of many publications describing projects on NLP shows the existence of a gap (very often, a huge gap) between the employed theoretical tools and the real demands of the studied problems. Let's consider only one example. The linguistic processor BLUE (= Boeing Language Understanding Engine) was developed as an advanced information processing tool for the Boeing company. The system is able to build SRs of sentences of many kinds. In the first section of one of the papers describing BLUE, the authors state that the system uses the formal means of first-order logic (FOL) for constructing SRs of sentences [8]. However, we learn from the second section of the same paper that the system BLUE "allows propositions to themselves be arguments to other propositions as a nested structuring". For instance, the system constructs an SR of the sentence "The man wanted to leave the house". This step immediately leads us beyond the scope of FOL. The reason is that atomic formulas of FOL can't include arguments being formal semantic images of infinitive constructions ("to leave the house", etc.). That is why the Boeing system BLUE, in fact, has no adequate theoretical background. Analysing the development of CL as a whole during the last twenty-five years, it is possible to observe a shift to numerous engineering projects solving particular practical tasks and a lack of attention to fundamental studies. It seems that one of the brightest descriptions of the recent and current situation in CL is given by Dr. Shuly Wintner from the Computer Science Department of the University of Haifa, Israel [57]. The starting point for Dr. Wintner was a high appreciation of the role played by mathematical theories in the development of many branches of engineering. For instance, aerodynamics underpins the design of airplanes, and hydrodynamics is the basis for constructing ships. In this connection the following questions were posed by Dr. Wintner: "What branch of science underlies NL Engineering? What is the theoretical infrastructure on which we build our applications? And what kind of mathematics is necessary for reasoning about human languages?" It would be very natural to expand this list of fundamental questions by adding the question posed in [36]: "How to formally represent the semantics of language?". The need for developing a comprehensive formal framework for creating an MSW makes highly topical the question about the mathematical foundations of computational semantics, the core of modern CL.
The analysis shows that the current state of computational semantics demands the development of an application-independent semantic formalism being convenient: (a) for describing the semantic structure of sentences including, in particular, infinitive and gerundial (for English) constructions expressing goals, commitments, commands, wishes, etc., attributive clauses of purpose, complex direct and indirect speech, and compound designations of notions and sets; (b) for presenting the semantic structure of discourses, in particular, of discourses with references to the meanings of previous sentences or larger fragments of the text; (c) for building representations of knowledge pieces, including the definitions of notions; (d) for constructing formal representations of simple and compound goals of people, robots, and organizations. This combination of expressive mechanisms is not proposed by FOL, Discourse Representation Theory, the Theory of Conceptual Graphs, Episodic Logic [48], or Abstract Meaning Representation. It is also possible to look at the formulated task from a more general position. The analysis of the scientific literature on semantic parsing and an MSW provides serious arguments in favour of the following conjecture: it is high time to create a new paradigm for considering the numerous theoretical problems encountered while constructing and processing various conceptual structures associated with Web-based informational sources: semantic representations of fragments of written and spoken texts (in other terms, text meaning representations); high-level conceptual descriptions of visual images; knowledge pieces stored in ontologies; the content of messages sent by computer intelligent agents, etc. How to find a key to solving this problem? We do know that, using NL, we are able to describe various pieces of knowledge, the semantic content of a visual image, the semantic content of a film, etc. That is why it can be conjectured that a key to elaborating a new paradigm of the described kind could be the construction of a broadly applicable and flexible Conceptual Metagrammar. It is to be a collection of rules (or partial operations) enabling us to construct step by step an SR of a practically arbitrary sentence or discourse pertaining to mass spheres of professional activity of people. In [29], the term "a comprehensive semantic formal environment" is used in the same sense. The prefix "meta" in the term "metagrammar" means that such rules are to use the information associated with classes of conceptual units. That is why we should be able to employ the same system of rules with different conceptual vocabularies.

4 Theory of K-representations as a source of a broadly applicable and flexible semantic formalism

Happily, a solution to the formulated problem is already available. It is given by the theory of K-representations (TKR). It should be underlined that its approach to describing the semantic structure of NL-texts is free from the listed shortcomings of AMR. In order to better understand the peculiarity of TKR, let's establish an analogy with bionics. Bionics studies the peculiarities of the structure and functioning of living beings in order to discover new ways of solving certain technical problems.
TKR was developed as a consequence of fulfilling a system analysis of the basic expressive mechanisms of NL and putting forward a conjecture about a system of partial operations on conceptual structures underpinning these expressive mechanisms.

4.1 Two versions of a broadly applicable and flexible conceptual metagrammar

The first basic constituent of TKR is two versions of a mathematical model (Model 1) describing a system of ten partial operations on conceptual structures. The first version (Model 1-A) is published in [17]. It should be noticed that the 9th operation introduced in [17] is modified in [18]. Model 1-A is the kernel of the theory of restricted standard knowledge languages (RSK-languages). The predecessor of this theory is the theory of S-calculuses and S-languages (see [11] and a retrospective outline in Section 2.3 of [25]). The second version (Model 1-B) is published in the monographs [21, 25] and is the kernel of the theory of standard knowledge languages (SK-languages). Each version of Model 1 gives us formal means convenient for building SRs of, likely (it is a hypothesis), arbitrarily complex sentences and discourses in NL pertaining to mass spheres of professional activity (engineering, business, medicine, etc.). The difference between Models 1-A and 1-B is as follows. Model 1-A allows us to proceed from only one angle of look at an entity from the considered thematic domain. To the contrary, Model 1-B makes it possible to consider an entity from several possible angles of look. Example. Both Model 1-A and Model 1-B consider a finite set of symbols St and the countable non-intersecting sets of symbols X and V. The elements of the sets X and V are interpreted respectively as primary informational units and variables. The set St (its elements are called sorts) is a subset of X. Suppose also that Model 1-A includes a mapping tp1 from the union of X and V into the countable set of symbols Types1, and Model 1-B includes a mapping tp2 from the union of X and V into the countable set of symbols Types2. Here Types1 and Types2 contain the symbols and strings interpreted as semantic characteristics of entities from the considered domains. Both Types1 and Types2 include the subset of sorts St, and Types1 is a subset of Types2. Suppose that X includes the unit D.Mendeleev; it denotes the famous Russian chemist Dmitry I. Mendeleev, the author of the periodic table of elements. Let St include the sorts ints and dyn.phys.ob ("intelligent system" and "dynamic physical object"). Then it is possible that either tp1(D.Mendeleev) = ints or tp1(D.Mendeleev) = dyn.phys.ob, but tp2(D.Mendeleev) = ints * dyn.phys.ob. Subsection 4.3 very shortly, without numerous mathematical details, characterizes the ten partial operations from Model 1-A and Model 1-B. Due to the very general level of discussion, the material of Subsection 4.3 illustrates the partial operations both from Model 1-A and Model 1-B. Due to the lack of mathematical details, the shortly described operations may seem to be very simple. However, Model 1-A and Model 1-B are strictly mathematical models; they define respectively two new classes of formal languages: the classes of RSK-languages and SK-languages. These models were developed due to the invention of an original methodology of constructing inductive definitions of formal objects with complex structure (see [17, 25]).
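Returning to the Mendeleev example, the contrast between tp1 and tp2 can be sketched informally in Python. Reading the compound type ints * dyn.phys.ob as a set of sorts is only one possible simplification of the formal definitions in [17, 21, 25], adopted here purely for illustration.

# An informal sketch of the typing maps tp1 and tp2.  The sorts and
# the unit D.Mendeleev come from the example in the text; modelling a
# compound type as a frozenset of sorts is our own simplification.
SORTS = {"ints", "dyn.phys.ob"}   # "intelligent system", "dynamic physical object"

# Model 1-A: one angle of look, so a single sort per primary unit.
tp1 = {"D.Mendeleev": "ints"}     # or "dyn.phys.ob", but never both

# Model 1-B: several angles of look, so a compound type.
tp2 = {"D.Mendeleev": frozenset({"ints", "dyn.phys.ob"})}
assert all(s in SORTS for s in tp2["D.Mendeleev"])   # built from sorts

def is_of_sort(unit, sort, tp):
    """Can the primary unit be viewed under the given sort?"""
    t = tp[unit]
    return sort == t if isinstance(t, str) else sort in t

print(is_of_sort("D.Mendeleev", "dyn.phys.ob", tp1))   # False in Model 1-A
print(is_of_sort("D.Mendeleev", "dyn.phys.ob", tp2))   # True in Model 1-B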
The analysis of the scientific literature on artificial intelligence theory and on mathematical and computational linguistics shows that today the class of SK-languages opens the broadest prospects for building semantic representations (SRs) of NL-texts (i.e., for representing the meanings of NL-texts in a formal way).

4.2 The models of linguistic database and algorithms of semantic parsing

The second basic constituent of TKR is two broadly applicable mathematical models of a linguistic database (LDB) [21, 25]. The models describe the frames expressing the necessary conditions of the existence of semantic relations, in particular, in word combinations of the kinds "Verbal form (verb, participle, gerund) + Preposition + Noun", "Verbal form + Noun", "Noun1 + Preposition + Noun2", "Noun1 + Noun2", "Number designation + Noun", "Attribute + Noun", "Interrogative word + Verb". The expressive power of SK-languages enables us to associate the lexical units with appropriate simple or compound semantic units. The models describe the logical structure of the LDBs being the components of NL interfaces to intelligent databases as well as to other applied computer systems. The third basic constituent of TKR is several complicated, strongly structured algorithms carrying out semantic parsing of texts from some practically interesting sublanguages of NL. The first and second algorithms, called SemSyn and SemSynt1 respectively, are based on the elaborated formal models of an LDB. The algorithm SemSyn [21] transforms an NL-text into its SR being a K-representation; the algorithm SemSyn is described in the two final chapters of the monograph [21], and the algorithm SemSynt1 is set forth in Chapters 9 and 10 of the monograph [25]. An important feature of these algorithms is that they don't construct any syntactic representation of the input NL-text but directly find semantic relations between text units. Since numerous lexical units have several meanings, the algorithm uses the information from a linguistic database and the linguistic context for choosing one meaning of a lexical unit among several possible meanings. The other distinctive feature is that these structured algorithms are completely described with the help of formal tools; that is why they are problem-independent and don't depend on a programming system. The algorithm SemSyn is implemented in the programming language Python. Additional information about the algorithms of semantic parsing proposed by TKR can be found in Section 5.

4.3 About ten partial operations on conceptual structures

The expressions of SK-languages will be called below K-strings. If Expr is an expression in NL and a K-string Semrepr can be interpreted as an SR of Expr, then Semrepr will be called a possible K-representation (KR) of the expression Expr. The KRs of NL-texts are formed from the primary informational units, the variables, and several service symbols by means of an iterative process of applying the operations of building well-formed formulas Op[1], …, Op[10]. The initial set of simplest formulas is determined by a special formal object called a conceptual basis (c.b.) and playing the role of the simplest knowledge base [21, 25]. The language determined by the considered c.b. B and the operations Op[1], …, Op[10] (they are defined by special statements, or rules, P[1], …, P[10]) is denoted as Ls(B) and is called the standard knowledge language (SK-language) in the basis B [21, 25].
The rule P[0] provides an initial stock of formulas. For example, if the string mouse1 is an element of a certain primary informational universe X(B), then mouse1 is a formula of Ls(B). For an arbitrary c.b. B, let Degr(B) be the union of all Cartesian m-degrees of Ls(B), where m is not less than 1. Then the meaning of the rules P[1], …, P[10] of constructing well-formed formulas can be explained as follows: for each k from 1 to 10, the rule P[k] determines a partial unary operation Op[k] on the set Degr(B) with the value being an element of Ls(B). Let's consider a short introduction to the partial operations Op[1], …, Op[10] for constructing formal representations of structured meanings. The operation Op[1] can be used to join intensional quantifiers to the designations of notions and produce formulas like certain car, certain car * (Manufacturer, IBM), all car * (Manufacturer, BMW). The operation Op[2] can be used to construct formulas like f(a1, …, an), where f is a functional symbol and a1, …, an are well-formed formulas of Ls(B). For example, Area(certain country) is a well-formed formula of a certain SK-language Ls(B). The operation Op[3] can be used to construct expressions of the form (a ≡ b). E.g., (Area(certain country) ≡ x12). The operation Op[4] can be used to construct formulas like rel(a1, …, an), where rel is a relational symbol and a1, …, an are formulas of Ls(B). E.g., Less(Area(certain country), 600,000/sq.km). The operation Op[5] allows us to mark KRs by some variables from the set of variables. For example, if a part of a KR looks like certain file1 * (Extension, ".docx") : v1, then we can refer to the expression certain file1 * (Extension, ".docx") in another part of a K-representation, using v1. The operation Op[6] provides the possibility to construct K-representations of the form ¬Formula, for example ¬car. The operation Op[7] allows us to use conjunction and disjunction in the formulas, e.g., (airplane ∨ helicopter), (mathematician ∧ painter). The operation Op[8] can be used to build compound designations of notions of the form concept * (r1, value1) … (rn, valuen), where concept is an element of a primary informational universe X(B) denoting a notion, r1, …, rn are the names of functions or relations, and value1, …, valuen are well-constructed formulas. This operation allows us to construct the formula country * (Location, Europe)(Capital, Vienna), being a KR of the expression "a country in Europe with the capital Vienna". The operation Op[9] allows us to use the quantifiers ∀ and ∃ like in FOL. The operation Op[10] enables us to build representations of ordered n-tuples as expressions of the form ⟨a1, …, an⟩, where a1, …, an are well-constructed formulas. Such n-tuples could be used to construct representations of complex verb constructions, for example Delete(⟨…⟩, ⟨…⟩, ⟨…⟩), where each argument is an n-tuple.
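For illustration only, several of these operations can be imitated by naive string builders in Python. The sketch below composes Op[8], Op[1], and Op[5] to produce the kind of K-string used in the examples of Subsection 4.4; it deliberately ignores the conceptual basis and the type checking that the real inductive definitions impose.

# A naive imitation of some of the operations Op[1]-Op[10] as string
# builders.  Real SK-languages are defined inductively over a
# conceptual basis with typing constraints; none of that is modelled.
def op8(concept, *pairs):      # Op[8]: compound designation of a notion
    return concept + " * " + "".join(f"({r}, {v})" for r, v in pairs)

def op1(quantifier, notion):   # Op[1]: join an intensional quantifier
    return f"{quantifier} {notion}"

def op5(kstring, variable):    # Op[5]: mark a K-string with a variable
    return f"{kstring} : {variable}"

def op2(func, *args):          # Op[2]: apply a functional symbol
    return f"{func}({', '.join(args)})"

def op4(rel, *args):           # Op[4]: apply a relational symbol
    return f"{rel}({', '.join(args)})"

# "a French textbook on biology" (Example 1 of Subsection 4.4):
print(op5(op1("certain",
              op8("textbook1", ("Country", "France"),
                               ("Activity-field", "biology"))),
          "x15"))
# -> certain textbook1 * (Country, France)(Activity-field, biology) : x15

# The Op[4] example from above:
print(op4("Less", op2("Area", op1("certain", "country")), "600,000/sq.km"))
# -> Less(Area(certain country), 600,000/sq.km)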
4.4 SK-languages as a tool of describing semantic structure of sentences

Before considering a number of examples illustrating the correspondence between an expression in NL and its possible KR, let's agree that the string Semrepr is to be interpreted as a possible KR of the regarded expression in NL.

Compound semantic descriptions of objects and sets of objects. The key role is played by the interaction of the operations Op[8], Op[1], and Op[5]. Using the operation Op[8] at the last step of constructing a formula and any of the operations Op[1], …, Op[10] at the previous steps, it is possible to construct an expression of the form conc * (rel1, d1) … (reln, dn), where conc is a simple (non-structured) designation of a notion, n ≥ 1, and for k = 1, …, n, relk is either the name of a function with one argument or the name of a binary relation. In the first case dk designates the value of the function relk, and in the second case dk designates the second attribute of the relation relk. Applying consecutively the operations Op[1] and Op[5], we can obtain an expression of the form qtr conc * (rel1, d1) … (reln, dn) : var, where qtr is an intensional quantifier (in particular, it may correspond to the meanings of the words and expressions "a certain", "any", "all") and var is a variable. Example 1. We can construct compound designations of the entities mentioned in texts. For example, the expression "a French textbook on biology" may be associated with the semantic image certain textbook1 * (Country, France)(Activity-field, biology) : x15. Example 2. It is possible to build compound designations of the mentioned sets, e.g., certain set1 * (Number-of-elements, 4)(Qualitative-composition, container1 * (Content1, ceramics * (Country-producer, (India OR China)))) : S7, where set1 designates the notion "finite set".

Building semantic representations of compound infinitive constructions. Example 1. Let Goal1 = "To receive an M.Sci. degree in business informatics at the Higher School of Economics (Moscow) and to found a company on e-business". Then a possible K-representation of Goal1 is (receiving1 * (Institution-role, certain university * (Name1, "Higher School of Economics")(Location, certain city * (Name1, "Moscow") : x1))(Document-role, certain acad-degree * (Kind, M.Sci.)(Field1, business-informatics) : x2) ∧ founding1 * (Organization-role, certain firm1 * (Field1, e-business) : x3)).

Representation of the meanings of sentences with indirect speech. Let T1 = "When did Mr. Peter Smith announce that he would visit Montpellier in April?". Then Semrepr = Question (t1, Situation (e1, informing1 * (Time, certain mom * (Before, #now#) : t1)(Agent1, certain man * (First-name, "Peter")(Surname, "Smith") : x1)(Inform-content, Situation (e2, visit1 * (Agent1, x1)(Location2, certain city * (Name1, "Montpellier") : x2)(Time, Nearest-month-future (April, #now#)))))).

Representing the meanings of sentences with subordinate clauses of purpose. Let T2 = "Mr. Peter Smith, a Vice-President of the firm "Rainbow", announced yesterday that he would visit Montpellier in April in order to sign an agreement with the company "CIRAD"". Then Semrepr = Situation (e1, informing1 * (Time, Previous-day (#now#))(Agent1, certain man * (First-name, "Peter")(Surname, "Smith") : x1)(Inform-content, Situation (e2, visit1 * (Agent1, x1)(Location2, certain city * (Name1, "Montpellier") : x2)(Time, Nearest-month-future (April, #now#))(Goal, signing2 * (Inform-object, certain agreement1 : x3)(Business-partner, certain company1 * (Name1, "CIRAD") : x4))))).

Semantic representation of homogeneous members of a sentence. Let T3 = "Jean would like to visit during this summer either Vienna, Bratislava, and Prague or Bergen, Oslo, and Stockholm".
Then Semrepr = Situation (e1, intention * (Time, #now#)(Emotional-agent, certain man * (First-name, "Jean") : x1)(Goal, visit1 * (Time, Nearest-season (summer, #now#))(Location2, (((certain city * (Name1, "Vienna") : x2) ∧ (certain city * (Name1, "Bratislava") : x3) ∧ (certain city * (Name1, "Prague") : x4)) ∨ ((certain city * (Name1, "Bergen") : x5) ∧ (certain city * (Name1, "Oslo") : x6) ∧ (certain city * (Name1, "Stockholm") : x7)))))).

Semantic descriptions of expressions with the words "a notion", "a term". Let S1 = "The term gene was first coined in 1909 by a Danish botanist, Johannsen, and was derived from the term pangen introduced by De Vries". Then Semrepr1 = Situation (e1, introduction1 * (Notion-name, certain notion * (Called, "gene") : c1)(Agent1, certain botanist1 * (Surname, "Johannsen")(Country-role, Denmark) : x1)(Time, 1909)) ∧ Situation (e2, derivation1 * (Notion-name, c1)(Agent1, x1)(Source-notion, certain notion * (Called, "pangen")(Authorship, certain person * (Surname, "De Vries") : x2))).

4.5 SK-languages as a tool of describing semantic structure of discourses and representing knowledge pieces

Example 1. Let Disc = S1. S2, where S2 = "This information is given in the textbook "Emery's Elements of Medical Genetics" by D. Turnpenny and S. Ellard; its 12th edition was published by Elsevier in 2005". Then Disc may have a KR of the form (Semrepr1 : P1 ∧ Information-source (P1, Semrepr2)), where Semrepr2 is the following possible KR of the sentence S2: certain textbook1 * (Title, "Emery's Elements of Medical Genetics")(Authorship, (D. Turnpenny ∧ S. Ellard))(Edition-number, 12)(Publishing-house, Elsevier)(Year, 2005) : x3. Here P1 is the variable marking the meaning of the first phrase of the text Disc. Example 2. Let Def = "Control gene is a gene which can turn other genes on or off". Then Semrepr3 = (Control-gene ≡ gene * (Is-able, (turning-on * (Object-bio, some gene : Set1) ∨ turning-off * (Object-bio, Set1)))). Example 3. It is possible to construct a different KR of the definition Def; it will reflect the metadata of the information piece, indicating the edition, the authors, and the year of publication. In this case Semrepr-with-metadata = certain inform-object * (Content1, Semrepr3)(Authorship, (D. Turnpenny ∧ S. Ellard))(Publishing-house, Elsevier)(Year, 2005)(Title, "Emery's Elements of Medical Genetics")(Edition-number, 12).

5 Principal distinctive features of two original approaches to semantic parsing

The theory of K-representations not only introduced a new class of formal languages (the class of SK-languages) for building SRs of complex sentences and discourses. It also used the definition of this class of formal languages as a starting point for developing two broadly applicable mathematical models of a linguistic database ([22], Chapter 6 of [21], and Chapter 7 of [25]) and an original method of extracting structured meanings from NL-texts (Chapter 8 of [25]). Here the term "method" denotes a method of developing multilingual algorithms of semantic-syntactic analysis of texts in NL. Such algorithms transform texts from certain sublanguages of NL into SRs (in other terms, text meaning representations). For building SRs, the class of SK-languages is used. The input texts may be at least from broad and practically interesting sublanguages of English, German, and Russian. The proposed method underpinned the development of a multilingual algorithm of semantic parsing, SemSynt1 (Chapters 9 and 10 of [25]).
It is the composition of two algorithms called BuildMatr1 and BuildSem1. The algorithm BuildMatr1 can be qualified as an original algorithm of semantic role labeling. The input texts may be questions of many kinds, commands, sentences, and discourses. The output of BuildMatr1 (more exactly, its principal part) is a special string-digital matrix Matr called a matrix semantic-syntactic representation (MSSR) of the input text. The matrix Matr is dynamically linked with an auxiliary data structure being a two-dimensional array Arls. In case an elementary meaningful text unit (or a token) wd has N different meanings, the array Arls will include N consecutive rows, where for k = 1, …, N the k-th row stores the information associated with the k-th meaning of wd. The configuration of an MSSR Matr changes during the semantic-syntactic processing of the input text. Each configuration determines, in particular, a marked oriented graph with the vertices being the distinguished elementary meaningful text units (or tokens) and a mapping from the subset of the vertices of this graph corresponding to lexical items to the set of meanings (or values) associated with these lexical items via the array Arls. Before the start of the text's processing, an edge from each lexical unit wd goes to the first row of Arls (that is, the row with the minimal order number) storing the semantic units associated with wd. Figure 1 illustrates this situation for processing the command "Download the green container on the platform". Here V1[1] is the value downloading1 (downloading a file), V1[2] is the value downloading2 (downloading a transportable physical object); V2[1] is the value green-colour, V2[2] is the value not-ripe, V2[3] designates the value a-member-of-green-movement; V3[1] is the value thing-container, V3[2] is the value data-structure-of-RDF; V4[1] is the value computer-platform, V4[2] is the value railway-station-platform, V4[3] is the value political-platform. Figure 2 illustrates the final situation.

Figure 1: Initial graph and mapping determined by an MSSR Matr.

Figure 2: Final graph and mapping determined by an MSSR Matr.

The output of the algorithm BuildMatr1 is the input of the algorithm BuildSem1. It transforms the information represented by an MSSR Matr of the input text into a possible SR of this text, which is a KR of the input text. Example. The command "Download the green container on the platform" can be associated with a possible KR of the form Command (#Operator#, #Executor#, #now#, downloading2 * (Object1, certain thing-container * (Colour, green) : x1)(Destination, certain railway-station-platform : x2)).
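As a simplified illustration of the structures just described (the real Arls rows and the MSSR carry many more fields), the candidate meanings of the tokens of this command and the initial and final token-to-meaning mappings can be sketched as follows:

# A simplified sketch of the array Arls for the command
# "Download the green container on the platform".  Each meaningful
# token is associated with the ordered list of its candidate meanings
# (the values V1[1], V1[2], ... from the text).
arls = {
    "download":  ["downloading1",            # downloading a file
                  "downloading2"],           # moving a physical object
    "green":     ["green-colour", "not-ripe", "a-member-of-green-movement"],
    "container": ["thing-container", "data-structure-of-RDF"],
    "platform":  ["computer-platform", "railway-station-platform",
                  "political-platform"],
}

# Initial mapping (Figure 1): every token points at the first row of
# its group in Arls.
interpretation = {token: meanings[0] for token, meanings in arls.items()}

# After semantic-syntactic processing, the context (a physical object
# being moved to a place) selects other rows; the final mapping of
# Figure 2 would look roughly like this:
interpretation.update({
    "download": "downloading2",
    "container": "thing-container",
    "platform": "railway-station-platform",
})
print(interpretation)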
The paper [44] expands the method introduced in Chapter 6 of [25]. On the one hand, the input language of the algorithm BuildMatr1 is enriched by means of the phrases expressing (a) the values of functions, (b) the restrictions of the functions' values, and (c) the relations between various objects formed with the help of comparative adjectives. On the other hand, it is well known that many notions corresponding to the words and word combinations from NL-texts are too general to be used for the interaction with a database. For instance, these are the concepts "IT-specialist" and "alumni". That is why it is proposed to use for the semantic parsing of NL-texts not only a linguistic database but also a linguistic knowledge base (LKB). It may consist of K-strings of the form illustrated by the following example: (IT-specialist ≡ person * (Qualification, (programmer ∨ database-administrator ∨ web-programmer))). Let's call unfolding concepts the concepts being the left parts of some expressions in the LKB. The proposed final step of processing NL-texts is to replace all semantic items from the constructed primary SR belonging to the subclass of unfolding concepts by less general concepts with the help of the definitions stored in the used LKB (it may be interpreted as a part of an ontology). E.g., the concept "IT-specialist" will be replaced by the compound concept person * (Qualification, (programmer ∨ database-administrator ∨ web-programmer)). The paper [45] introduces a highly compact way of describing the formal structure of linguistic databases (the semantic-syntactic component) and of presenting the algorithms of semantic parsing. The paper contains the algorithm of semantic parsing SemSyntRA, developed within the framework of the proposed approach (see also the next section).

6 Applications of the K-representations theory

The arguments stated above and numerous additional arguments set forth in the monograph [25] give serious grounds to conclude that the class of SK-languages, provided by TKR, can be interpreted as the first comprehensive semantic formal environment for studying various semantics-associated problems of developing an MSW. It seems reasonable to speak about two levels of applying TKR to solving practical tasks. The first level is the direct use, in the design of NL processing systems, of the mathematical model of a linguistic database introduced in Chapter 7 of the monograph [25] and of the algorithm of semantic parsing SemSynt1 described in Chapters 9 and 10 of the same monograph. This algorithm is multilingual: its input texts may be questions of many kinds, statements, and commands from sublanguages of English, German, and Russian. The mentioned model and algorithm were applied by the author and his Ph.D. students to the design of an NL interface of a recommender system [41], to the design of an advanced semantic search system [30], and to the design of an NL interface of an applied intelligent system making easier the interaction of a user with the file system of a computer [44, 45]. Two versions of this system are called NLC-1 [44] and NLC-2 [45] (here NLC = Natural Language Commander). Example. Let's look at how NLC-1 processed the following user instruction: "Copy music files from "Download" folder to folder with name "Music" or "My music" on backup drive if their size is less than 1 GB". This instruction has the following primary K-representation, constructed by SemSynt2 - a modification of the algorithm SemSynt1: If-then(Less(SizeOf(all music1 * (Place1, certain folder1 * (Name1, "Download")) : o1), 1/GB), Command (#Operator#, #Executor#, #now#, copying * (Source1, o1)(Destination1, certain folder1 * (Name1, ("Music" ∨ "My music"))(Place1, certain backup-drive)))). Now if the knowledge base of NLC-1 contains the K-strings (music1 ≡ file1 * (Extension, ("mp3" ∨ "ogg" ∨ "wav" ∨ "aac"))) and (backup-drive ≡ drive1 * (Name1, "F")), and the knowledge management system includes the rule (x, (x ≡ y) ├ y), then NLC-1 transforms the constructed primary KR of the user instruction into the secondary KR If-then(Less(SizeOf(all file1 * (Extension, ("mp3" ∨ "ogg" ∨ "wav" ∨ "aac"))(Place1, certain folder1 * (Name1, "Download")) : o1), 1/GB), Command (#Operator#, #Executor#, #now#, copying * (Source1, o1)(Destination1, certain folder1 * (Name1, ("Music" ∨ "My music"))(Place1, certain drive1 * (Name1, "F"))))). Then the resulting shell script is if [ $(du -cb Download/*.mp3 Download/*.ogg Download/*.wav Download/*.aac | grep total | sed -e "s/\s.*$//") -le 1000000000 ]; then cp Download/*.mp3 Download/*.ogg Download/*.wav Download/*.aac "/f/$(ls /f/ | grep -iE '^(Music|My music)$' | head -n1)"; fi
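The passage from the primary to the secondary KR can be pictured as substitution driven by the ≡-definitions of the knowledge base. The following deliberately naive Python sketch applies the rule (x, (x ≡ y) ├ y) by raw string replacement, whereas NLC-1, of course, rewrites K-strings structurally; the sketch assumes the definitions are non-recursive.

# A naive sketch of unfolding concepts via the knowledge base: every
# defined concept occurring in the primary KR is replaced by the
# right-hand side of its definition.  Raw string substitution is used
# only for illustration.
knowledge_base = {
    "music1": 'file1 * (Extension, ("mp3" ∨ "ogg" ∨ "wav" ∨ "aac"))',
    "backup-drive": 'drive1 * (Name1, "F")',
}

def unfold(kr, kb):
    """Produce a secondary KR by unfolding every defined concept.
    Assumes non-recursive definitions, so one pass suffices."""
    for concept, definition in kb.items():
        kr = kr.replace(concept, definition)
    return kr

primary = ('If-then(Less(SizeOf(all music1 * (Place1, certain folder1 * '
           '(Name1, "Download")) : o1), 1/GB), ...)')
print(unfold(primary, knowledge_base))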
Written in the Haskell programming language, NLC-1 is a flexible and scalable application. It can be configured by a researcher for different domains and underlying shells. The paper [45] describes a modified theoretical foundation of the second version, NLC-2. The great advantages of the proposed comprehensive semantic formal environment are promised by the second level of applications: it is the case of using SK-languages for describing lexical semantics, representing the semantic content of sentences and discourses in NL, building models of advanced ontologies, constructing semantic annotations of Web documents (see Section 6.2 of [25]), and forming high-level conceptual descriptions of visual images (see Section 6.3 of [25]) in numerous scientific centres and research groups throughout the world.

7 A contribution to developing a Multilingual Semantic Web

Endowing the existing Web with the ability to understand many natural languages is an objective, ongoing process. The analysis has shown that there is a way to increase the overall effectiveness of this global decentralized process. It would be especially important with respect to the need for cross-language conceptual information retrieval and question answering. The way proposed in [25-29] is a possible new paradigm for the mainly decentralized process of endowing the existing Web with the ability of processing many natural languages. The principal idea of the new paradigm is as follows. There is a common thing for the various texts in different natural languages: the fact that NL-texts have meanings. The meanings are associated not only with NL-texts but also with the visual images (stored in multimedia databases) and with the pieces of knowledge from ontologies. That is why great advantages are promised by the realization of the situation when a unified formal semantic environment is used in different projects throughout the world for reflecting the structured meanings of texts in various natural languages, for representing knowledge about application domains, for constructing semantic annotations of informational sources, and for building high-level conceptual descriptions of visual images. The analysis of the expressive power of SK-languages (see Chapters 3-6 of [25]) shows that the SK-languages can be used as a unified formal semantic environment of this kind.
This idea underlies an original strategy of transforming, step by step, the existing Web into a Semantic Web of a new generation, whose principal distinctive feature would be a well-developed ability of NL processing; it can also be qualified as a Multilingual Semantic Web. The versions of this strategy are published in [25-29].

8 Conclusion

Computational semantics has received a firm theoretical ground. The SK-languages, introduced by the theory of K-representations, open new prospects for formalizing lexical semantics, representing the semantic content of sentences and discourses in NL, building models of advanced ontologies, forming high-level conceptual descriptions of visual images, and constructing semantic annotations of Web documents in numerous scientific centres and research groups. Many existing projects on NL processing, including semantic parsing, have received an appropriate theoretical framework for the next stages of research. For an MSW it is also very important that SK-languages provide a convenient intermediary level for moving from NL input to OWL-based ontologies. This paper provides additional arguments in favour of the conjecture formulated in [24-29]: TKR can be and should be used as a comprehensive and flexible basic formal tool for solving the tasks of developing an MSW associated with the semantics of NL.

References

[1] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., Schneider, N. (2013). Abstract Meaning Representation for Sembanking. In: Proceedings of the 7th ACL Linguistic Annotation Workshop and Interoperability with Discourse, Sofia, Bulgaria, August 8-9, 2013 (www.aclweb.org/anthology/W13-2322; retrieved 2016-03-12).
[2] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., Schneider, N. (2015). Abstract Meaning Representation (AMR) 1.2.2 Specification; github.com/amrisi/amr-guidelines/blob/master/amr.md.
[3] Blanco, E., Moldovan, D. (2014). Leveraging verb-argument structures to infer semantic relations. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, Gothenburg, Sweden, April 26-30, 2014. ACL, pp. 145-154.
[4] Bordes, A., Glorot, X., Weston, J., Bengio, Y. (2012). Joint learning of words and meaning representations for open-text semantic parsing. In: Proc. of the 15th Intern. Conf. on Artificial Intelligence and Statistics (AISTATS) 2012, Las Palmas, Canary Islands, Vol. 22, pp. 127-135.
[5] Buitelaar, P., Choi, K.-S., Cimiano, P., Hovy, E. H. (Eds.) (2012). Report from Dagstuhl Seminar 12362 "The Multilingual Semantic Web" (2-9 September, 2012). Schloss Dagstuhl: Leibniz-Zentrum fuer Informatik.
[6] Cambridge Semantics Inc., The Smart Data Company, Web page; http://www.cambridgesemantics.com/semantic-university/nlp-and-semantic-web (retrieved 14.10.2016).
[7] Cimiano, P., Haase, P. et al. (2008). Towards portable natural language interfaces to knowledge bases - the case of the ORAKEL system. Data and Knowledge Engineering, Vol. 65, No. 2, pp. 325-354.
[8] Clark, P., Harrison, P. (2008). Boeing's NLP system and the challenges of semantic representation. In: Proc. SIGSEM Symposium on Text Processing (STEP'08), Venice, Italy, ACL, pp. 263-276.
[9] Das, D., Chen, D., Martins, A. F. T., Schneider, N., Smith, N. A. (2014). Frame-semantic parsing. Computational Linguistics, Vol. 40, No. 1, pp. 9-56.
[10] Fillmore, C., Johnson, C. R., Petruck, M. R. L. (2003). Background to FrameNet. International Journal of Lexicography, Vol. 16, No. 3, pp. 235-250.
[11] Fomitchov, V. A. (1984). Formal systems for natural language man-machine interaction modelling. In: Artificial Intelligence. Proc. of the IFAC Symposium, Leningrad, USSR, 4-6 Oct. 1983, Ponomaryov, V. M. (Ed.), Oxford, UK, Pergamon Press Ltd., New York, Pergamon Press Inc., 1984, pp. 203-207 (IFAC Proc. Series, 1984, No. 9).
[12] Fomichov, V. A. (1988). Representing Information by Means of K-calculuses. Textbook. Moscow, The Moscow Institute of Electronic Engineering.
[13] Fomichov, V. A. (1992). Mathematical models of natural-language-processing systems as cybernetic models of a new kind. Cybernetica. Quarterly Review of the International Association for Cybernetics (Belgium, Namur), Vol. 35, No. 1, pp. 63-91.
[14] Fomichov, V. A. (1993). Towards a mathematical theory of natural language communication. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 17, No. 1, pp. 21-34.
[15] Fomichov, V. A. (1993). K-calculuses and K-languages as powerful formal means to design intelligent systems processing medical texts. Cybernetica (Belgium), Vol. XXXVI, No. 2, pp. 161-182.
[16] Fomichov, V. A. (1994). Integral Formal Semantics and the design of legal full-text databases. Cybernetica (Belgium), Vol. XXXVII, No. 2, pp. 145-177.
[17] Fomichov, V. A. (1996). A mathematical model for describing structured items of conceptual level. Informatica. An International Journal of Computing and Informatics (Slovenia), Vol. 20, No. 1, pp. 5-32.
[18] Fomichov, V. A. (1998). Theory of restricted K-calculuses as a comprehensive framework for constructing agent communication languages. In: Fomichov, V. A., Zeleznikar, A. P. (Eds.), Special Issue on NLP and Multi-Agent Systems. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 22, No. 4, pp. 451-463.
[19] Fomichov, V. A. (2000). An ontological mathematical framework for electronic commerce and semantically-structured Web. In: Zhang, Y., Fomichov, V. A., Zeleznikar, A. P. (Eds.), Special Issue on Database, Web, and Cooperative Systems. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 24, No. 1, pp. 39-49.
[20] Fomichov, V. A. (2002). Theory of K-calculuses as a powerful and flexible mathematical framework for building ontologies and designing natural language-processing systems. In: Andreasen, T., Motro, A., Christiansen, H., Larsen, H. L. (Eds.), Flexible Query Answering Systems, 5th Intern. Conference, FQAS 2002, Proceedings, Lecture Notes in Artificial Intelligence, Vol. 2522, Springer: Berlin, Heidelberg, New York, pp. 183-196.
[21] Fomichov, V. A. (2005a). The Formalization of Designing Natural Language Processing Systems. Moscow: MAX Press, 368 p. (in Russian).
[22] Fomichov, V. A. (2005b). A new method of transforming natural language texts into semantic representations. Informational Technologies, Moscow, No. 10, pp. 25-35 (in Russian).
[23] Fomichov, V. A. (2007). Mathematical Foundations of Representing the Content of Messages Sent by Computer Intelligent Agents. Moscow, State University - Higher School of Economics, Publishing House "TEIS", 176 p. (in Russian).
[24] Fomichov, V. A. (2008). A comprehensive mathematical framework for bridging a gap between two approaches to creating a Meaning-Understanding Web. Intern. Journal of Intelligent Computing and Cybernetics, Vol. 1, No. 1, pp. 143-163.
(2010a). Semantics-Oriented Natural Language Processing: Mathematical Models and Algorithms. Springer, New York, Dordrecht, Heidelberg, London, 352 p.
[26] Fomichov, V. A. (2010b). Theory of K-representations as a comprehensive formal framework for developing a Multilingual Semantic Web. Informatica. An Intern. Journal of Computing and Informatics (Slovenia), Vol. 34, No. 3, pp. 387-396.
[27] Fomichov, V. A. (2011). The prospects revealed by the theory of K-representations for bioinformatics and Semantic Web. Actes de la 18e Conférence sur le Traitement Automatique des Langues Naturelles. Actes de la 15e Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues. Montpellier, France, 27th June - 1st July 2011, Vol. 1: Actes: articles longs. Montpellier: AVL Diffusion, pp. 5-20.
[28] Fomichov, V.A. (2013). A broadly applicable and flexible conceptual metagrammar as a basic tool for developing a Multilingual Semantic Web. In: Metais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (Eds.), Natural Language Processing and Information Systems. 18th Intern. Conference on Applications of Natural Language to Information Systems, NLDB 2013, Salford, UK, June 2013, Proceedings. Lecture Notes in Computer Science, Vol. 7934, Springer, Berlin, Heidelberg, pp. 249-259.
[29] Fomichov, V. A. (2014). SK-languages as a comprehensive formal environment for developing a Multilingual Semantic Web. Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (Eds.), Database and Expert Systems Applications, 25th Intern. Conference, DEXA 2014, Munich, Germany, September 1-4, 2014, Part I, Proceedings. Lecture Notes in Computer Science, Vol. 8644, Cham: Springer International Publishing Switzerland, pp. 394-401.
[30] Fomichov, V.A., Kirillov, A.V. (2012). A formal model for constructing semantic expansions of the search requests about the achievements and failures. Artificial Intelligence: Methodology, Systems, and Applications, Ramsay, A., Agre, G. (Eds.), Lecture Notes in Computer Science, Vol. 7557, Springer, Berlin, Heidelberg, pp. 296-304.
[31] Gildea, D., Jurafsky, D. (2002). Automatic Labeling of Semantic Roles. Computational Linguistics, Vol. 28, No. 3, pp. 245-288.
[32] Google Hummingbird (2016); https://en.wikipedia.org/wiki/Google_Hummingbird (retrieved 10.11.2016).
[33] Kingsbury, P., Palmer, M. (2002). From TreeBank to PropBank. Proceedings of LREC 2002.
[34] Langkilde, I., Knight, K. (1998). Generation that exploits corpus-based statistical knowledge. Proc. of the 36th Annual Meeting of the ACL and 17th International Conference on Computational Linguistics, Montreal, pp. 704-710.
[35] Li, B., Wen, Y., Bu, L., Qu, W., Xue, N. (2016). Annotating the Little Prince with Chinese AMRs. Proc. of LAW X – the 10th Linguistic Annotation Workshop, Berlin, Germany, August 11, 2016, ACL, pp. 7-15.
[36] Liang, P. (2016). Learning executable semantic parsers for natural language understanding. Communications of the ACM, Vol. 59, No. 9, pp. 68-76.
[37] Lu, C., Xu, Y., Geva, S. (2008). Web-based query translation for English-Chinese CLIR. Computational Linguistics and Chinese Language Processing (CLCLP), pp. 61-90.
[38] Marcus, M. P., Marcinkiewicz, M. A., Santorini, B. (1993). Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, Vol. 19, No. 2.
[39] Marquez, L., Carreras, X., Litkowski, K. C., Stevenson, S. (2008). Semantic Role Labeling: an Introduction to the Special Issue. Computational Linguistics, Vol. 34, No. 2, pp. 145-159.
[40] Ontos GmbH company Web page (2016): www.ontos.com.
[41] Pravikov, A.A., Fomichov, V.A. (2010). Development of a recommender system with a natural language interface on the basis of semantic objects' mathematical models. Business Informatics. Interdisciplinary Scientific-Practical Journal, Moscow, State University – Higher School of Economics, No. 4 (14), pp. 3-11.
[42] Punyakanok, V., Roth, D., Yih, W. T. (2008). The importance of syntactic parsing and inferencing in semantic role labeling. Computational Linguistics, Vol. 34, No. 2, pp. 257-287.
[43] Pust, M., Hermjakob, U., Knight, K., Marcu, D., May, J. (2015). Parsing English into abstract meaning representation using syntax-based machine translation. In: Proc. of EMNLP 2015, Lisbon, pp. 1143-1154.
[44] Razorenov, A. A., Fomichov, V. A. (2014). The Design of a Natural Language Interface for File System Operations on the Basis of a Structured Meanings Model. Procedia Computer Science, Elsevier, Vol. 31, pp. 1005-1011; open access, URL: http://authors.elsevier.com/sd/article/S1877050914005304.
[45] Razorenov, A. A., Fomichov, V. A. (2016). A new formal approach to semantic parsing of instructions and to file manager design. In: Database and Expert Systems Applications, 27th Intern. Conference, DEXA 2016, Porto, Portugal, September 5-8, 2016, Part I, Proceedings. Lecture Notes in Computer Science, Vol. 9827, Cham: Springer International Publishing Switzerland, pp. 416-430.
[46] Rindflesch, T.C., Kilicoglu, H., Fiszman, M., Rosemblat, G., Shin, D. (2011). Semantic MEDLINE: An Advanced Information Management Application for Biomedicine. Information Services and Use, IOS Press, Vol. 31, pp. 15-21.
[47] Sawai, Y., Shindo, H., Matsumoto, Y. (2015). Semantic structure analysis of noun phrases using abstract meaning representation. Proc. of the 53rd Annual Meeting of the ACL (Volume 2: Short Papers), Beijing, pp. 851-856.
[48] Schubert, L.K., Hwang, C.H. (2000). Episodic Logic meets Little Red Riding Hood: A comprehensive, natural representation for language understanding. In: Iwanska, L., Shapiro, S.C. (Eds.), Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language, MIT/AAAI Press, Menlo Park, CA, and Cambridge, MA, pp. 111-174.
[49] Stellato, A. (2016). A language-aware Web will give us a bigger and better Semantic Web. MSW 2015. Multilingual Semantic Web. Proc. of the Fourth Workshop on the Multilingual Semantic Web (MSW4) co-located with the 12th Extended Semantic Web Conference (ESWC 2015), Portoroz, Slovenia, June 1, 2015, pp. 1-14.
[50] Sullivan, D. (2013). FAQ: All about the new Google "Hummingbird" algorithm. Search Engine Land, 26 September 2013, http://searchengineland.com/google-hummingbird-172816 (retrieved 10.11.2016).
[51] Uchida, H., Zhu, M., Della Senta, T. (1999). A Gift for a Millennium.
[52] Uren, V.S., Lei, Y., Motta, E. (2008). SemSearch: Refining Semantic Search. In: Bechhofer, S., Hauswirth, M., Hoffmann, J., Koubarakis, M. (Eds.), ESWC 2008, LNCS, Vol. 5021, Springer, Heidelberg, pp. 874-878.
[53] Vogt, M. (2016). How Natural Language Processing will change the Semantic Web. Semantics, April 13, 2016; https://2016.semantics.cc/how-natural-language-processing-will-change-semantic-web (retrieved 12.09.2016).
[54] Wang, C., Xue, N., Pradhan, S. (2015). Boosting transition-based AMR parsing with refined actions and auxiliary analyzers. In: Proc. of the 53rd Annual Meeting of the ACL (Volume 2: Short Papers), Beijing, pp. 857-862.
[55] Werling, K., Angeli, G., Manning, C.D. (2015). Robust subgraph generation improves abstract meaning representation parsing. In: Proc. of the 53rd Annual Meeting of the ACL (Volume 1: Long Papers), Beijing, pp. 982-991.
[56] Wilks, Y., Brewster, C. (2006). Natural Language Processing as a Foundation of the Semantic Web. Foundations and Trends in Web Science, Vol. 1, No. 3. Hanover, MA; Delft: now Publishers Inc.
[57] Wintner, S. (2009). What science underlies natural language engineering? Computational Linguistics, Vol. 35, No. 4, pp. 641-644.
[58] Yi, S., Loper, E., Palmer, M. (2007). Can semantic roles generalize across genres? Proceedings of the Human Language Technologies Conference of the North American Chapter of the ACL, Rochester, NY, pp. 548-555.

Informatica 41 (2017) 233–252 233

Formal Development of Multi-Agent Systems with FPASSI: Towards Formalizing PASSI Methodology using Rewriting Logic

Mihoub Mazouz
Department of Mathematics and Computer Science, RELA(CS)2 Laboratory
University of Larbi Ben M'Hidi, Oum El Bouaghi, Algeria
E-mail: mazouz_mihoub@hotmail.fr

Farid Mokhati
Department of Mathematics and Computer Science, RELA(CS)2 Laboratory
University of Larbi Ben M'Hidi, Oum El Bouaghi, Algeria
E-mail: mokhati@yahoo.fr

Mourad Badri
Department of Mathematics and Computer Science, Glog Laboratory
University of Quebec, Trois-Rivières, Canada
E-mail: Mourad.Badri@uqtr.ca

Keywords: formal development of MAS, PASSI, validation, verification, rewriting logic, maude, maude-strategy, model-to-text transformation

Received: June 20, 2016

Agent technology has proved its ability and efficiency in modelling complex distributed applications. During the last two decades, several MAS development methodologies have been proposed, for instance Gaia, Tropos and PASSI. Although these methodologies have made significant contributions to meeting several challenges in the MAS development field, most of them do not use formal techniques. Formal methods, as is well known, play a significant role in developing more reliable and robust MAS. This paper presents the Formal-PASSI methodology. Formal-PASSI is an extension of the well-known PASSI methodology. The extension consists mainly of the integration of a new formal model into the design process. The new model is based on the Maude language and its extension Maude-Strategy. It aims at offering a formal description of the MAS under development by a Model-to-Text transformation. The generated formal description is then used to validate some PASSI behavioural diagrams and to check properties at both single- and multi-agent abstraction levels before passing to the code model. The integration of formal methods into the PASSI design process seems to be a good way to ensure the development of high-quality agent-based applications. The proposed approach is supported by a tool (F-PTK) that we have developed; it is illustrated through the ATM case study.

Povzetek: V članku je predstavljena formalna PASSI MAS metodologija, tj. multi-agentna metodologija.

1 Introduction

Current computing systems have become increasingly complex, with high safety requirements. Agent technology has proved its ability and efficiency in modelling complex distributed applications. As with any other technology, the emergence of agent technology has pushed the research community to propose new methodologies, languages and tools to support it and to enable its wider adoption in the industry sector.
Many methodologies, like PASSI [1, 2], Gaia [3, 4], ADELFE [5, 6, 7], Prometheus [8], Tropos [9] and INGENIAS [10], have been proposed to facilitate and assist the development of Multi-Agent Systems (MAS). Although these methodologies have made real progress in the MAS development field, proposing new methodologies that assist agent-based systems development is still insufficient for industrial adoption [11]. The development of such systems requires solid bases in terms of specification. Existing methodologies use abstract and/or semi-formal specifications. Although such types of specifications offer several advantages, such as readability and ease of comprehension, they have drawbacks like ambiguity and inconsistency, which are difficult to detect manually. Formal specifications, however, overcome these drawbacks and enable the description of the system under development in a precise and unambiguous way. Using formal methods is essential to produce high-quality agent-based systems at the end of the development process. In particular, integrating formal methods into the development process of MAS methodologies leads to the production of reliable systems. In order to overcome the problems quoted above, many proposals try to use formal methods in agent-oriented software engineering (AOSE) (see Section 2). However, most of them present several limitations; in particular, they do not use formal methods within an entire design process. Moreover, many of them are not supported by adequate tools.

PASSI (Process for Agent Societies Specification and Implementation) [1, 2] is a step-by-step requirement-to-code methodology for designing and developing agent-oriented systems that integrates concepts from both Object-Oriented Software Engineering (OOSE) and MAS using the UML (Unified Modelling Language) notation. PASSI covers almost all stages of the development process, and can be used to assist the development of general-purpose agent-oriented systems, although it evolved from a long period of experimentation in the development of embedded robotics applications [12]. However, the fact that PASSI is based on a semi-formal language such as UML makes validation and verification activities less effective.

In this paper, we propose F-PASSI (Formal-PASSI), a formalization of the PASSI methodology obtained by adding a new formal model into its design process. The extension is based on rewriting logic [13, 14] and particularly the Maude language [15, 16] (and its extension Maude-Strategy [17]). The integrated model aims at offering a Maude-based formal description of the MAS under development to enrich the semantics of its UML-based design. The produced formal description is then exploited to validate PASSI behavioural diagrams (some of them, so far) by formal simulation thanks to Maude, and by the Maude LTL model checker [18] in order to verify system properties at both single- and multi-agent abstraction levels. A tool was developed to support our approach.

The remainder of this paper is organized as follows. In section 2, we give an overview of major related works. In section 3, we give a brief description of rewriting logic as well as the Maude language (and its extension Maude-Strategy). In section 4, a brief description of the PASSI methodology is given. We introduce, in section 5, the proposed formal extension of PASSI. The tool we developed is presented in section 6. In section 7, the ATM case study is used to illustrate our approach.
Finally, section 8 gives some conclusions and future work directions.

2 Related works

Using formal methods in multi-agent systems development is a challenge raised by many researchers in the MAS area. El Fallah-Seghrouchni et al. have presented a classification of the proposed works on the formal development of MAS [19]. According to the authors, three alternatives can be captured from the literature: (A) formal derivation, which is a kind of model-to-code transformation and aims at realizing a MAS based on a given specification; (B) enhancement of an existing methodology by integrating formal meaning into its design; (C) proposing a new one. Since our work can be considered as an integration of formal methods into an existing methodology, PASSI, this section focuses on works belonging to the second category.

In [20, 21], Ball et al. have presented an incremental development process using Event-B [22] for multi-agent systems. The proposed process can be divided into two stages. In the first one, informal models based on agent concepts are constructed. In the second stage, based on the informal models, the Event-B models are constructed by the developer, who is provided with guidance that makes the transformation from the informal design to the formal models straightforward. The constructed Event-B models are refined and decomposed into specifications of roles. In [23], a set of modelling patterns providing fault-tolerance in Event-B models of multi-agent interactions is presented.

Another work proposing a new formal methodology is ForMAAD [24, 25]. ForMAAD is a model-driven approach for designing agent-based applications. It uses the Agent Modelling Language (AML) [26] to model architectural and behavioural concepts associated with multi-agent systems, and Temporal Z [27] to guarantee a formal verification of the models. Extensions of the StarUML tool (http://staruml.io) were made to support the models they proposed.

Two works using formal methods for the Tropos methodology [9] can be emphasized here. First, Fuxman et al. [28] have proposed an extension of Tropos, Formal Tropos, with a formal specification of early requirements. For that, the Formal Tropos language is defined by integrating the primitive concepts of Tropos with a temporal specification language inspired by KAOS [29]. After the translation (using the implemented T-tool, http://disi.unitn.it/~ft/ft_tool.html) of the requirements specification written by the analyst into an intermediate language, an enhanced version of the NuSMV model checker [30] performs consistency checking ("the specification admits valid scenarios"), possibility checking ("there are some scenarios for the system that respect certain possibility properties") and assertion validation ("all scenarios for the system respect certain assertion properties"). Secondly, in [31], a mapping of β-Tropos concepts [32] into the computational logic-based framework SCIFF [33] is defined, and important formal properties (soundness, completeness and termination) are identified and discussed. The formal specifications are verified using the SCIFF engine. Instead of being written manually, as in the works above, the formal specification in Formal-PASSI is produced in a systematic way thanks to F-PTK (Formal-PASSI Tool Kit), the tool we have developed; this makes it, unlike Formal Tropos, less dependent on the subjective judgment of the developer.
Also, in Formal-PASSI, the formal specification combines, in addition to the domain knowledge, the structure and behaviour of the agents composing the MAS, to be exploited later in order to validate and verify its correctness.

Instead of proposing new formal methodologies for MAS development or enhancing existing ones, other researchers have used formal methods, separately from any methodology, for particular design aspects. Fadil et al. [34] have used the B method [35, 36] to formally model interactions between agents in order to check and then prove the initial UML specification. The approach was illustrated using the Contract-Net protocol as a case study. Jemni Ben Ayed et al. [37] have presented a specification and verification technique for interaction protocols in MAS combining AUML (Agent UML) [38] and the Event-B method [22]. In their technique, the interaction protocol is modelled in an AUML protocol diagram and translated into Event-B. The required interaction protocol (IP) safety and liveness properties are added to the derived specification for verification using the B4free tool (http://www.b4free.com). Like the B method, the Z language [39] and its extension Temporal Z [27] have been the subject of many works. In [40], the authors have presented a formal approach using Temporal Z in two phases. In the specification phase, user requirements are described in an abstract way, avoiding the description of implementation details. Then, based on a succession of refinements, the design phase aims at inventing a set of inter-agent (collective) behaviours as well as intra-agent (individual) behaviours, which have to satisfy the user requirements.

Other works address the use of formal methods at runtime to verify properties that are not verifiable in the design phase, as in [41], where a JADE-based formal verification methodology for MAS following a semi-runtime approach has been proposed. The proposed verification process uses timed trace theory to detect time-constraint failures. Lapouchnian et al. [42] have proposed a combined agent-oriented requirements engineering approach using informal i* [43] models together with ConGolog [44] and (its extension) CASL [45] formal specifications. Social dependencies between agents are modelled using the i* framework, which is used to perform an analysis of opportunities and vulnerabilities. The models are gradually made more precise by using annotated models (annotations are introduced in [46] and extended in [47]). After that, complex processes can be formally modelled using ConGolog or CASL, with subsequent verification or simulation.

In [48], the authors have presented an extension of the G-net formalism [49] (a type of high-level Petri net), called Agent-oriented G-net, to serve as a high-level design of intelligent agents in terms of their internal states, their environment, their interactions, etc. Based on this high-level design, the agent architecture and the detailed design for agent implementation can be derived using the ADK tool they developed. Stamatopoulou et al. [50] have presented an open framework facilitating the formal modelling of multi-agent systems, called OPERAS, which employs two existing formal methods: X-machines [51] and PPS (Population P Systems) [52]. Using this framework, an agent's behaviour can be formally modelled and controlled in terms of its internal states, as well as the mutations that occur in the structure of a MAS. The authors have applied the framework to swarm systems.
Compared to the works discussed above, the approach we propose: (1) integrates formal methods, not separately from any methodology, but into an entire design process (the PASSI design process); (2) is based on a powerful formal language, Maude, which offers many tools, such as the Maude LTL model checker [18]; (3) checks the specified properties before passing to code details; (4) is supported by a tool (F-PTK) which offers many services, such as automating the production of the Maude-based formal description of the MAS under development by means of its structure (agents, roles, tasks, action tasks) and the domain knowledge.

3 Rewriting Logic, Maude & Maude-Strategy

3.1 Rewriting Logic

Rewriting logic was introduced by José Meseguer [13, 14] to describe concurrent systems. It makes it possible to reason correctly about concurrent systems that have states and evolve in terms of transitions. Indeed, rewriting logic unifies several formal models which express concurrency, such as labelled transition systems [53], Petri nets [54] and CCS [55]. The basic statements of this logic are called rewriting rules and have the form t → t' if C, where t and t' are algebraic terms describing a partial state of the concurrent system. A rewriting rule, in this case, describes a change of one partial state into another if a certain condition C is true. Formally, a rewrite theory is a triplet R = (Σ, E, R) where:
- (Σ, E) is an equational theory with function symbols Σ and equations E;
- R is a set of labelled rewrite rules. These rules are of the form t → t' (unconditional rewriting rules) or t → t' if condition (conditional rewriting rules). An unconditional rewriting rule states that the term t becomes t'; a conditional rewriting rule states that t becomes t' if a certain condition is true.

A rewrite theory has a set of inference rules [13, 14]:
- Reflexivity: for each [t] ∈ TΣ,E(X), [t] → [t].
- Congruence: for each f ∈ Σn, n ∈ N: if [t1] → [t1'], ..., [tn] → [tn'], then [f(t1, ..., tn)] → [f(t1', ..., tn')].
- Replacement: for each rewriting rule r: [t(x1, ..., xn)] → [t'(x1, ..., xn)] in R: if [w1] → [w1'], ..., [wn] → [wn'], then [t(w/x)] → [t'(w'/x)], where t(w/x) denotes the simultaneous substitution of wi for xi in t.
- Transitivity: if [t1] → [t2] and [t2] → [t3], then [t1] → [t3].

Figure 1 visualizes each one of these rules.
Figure 1: Visualization of the inference rules of a rewriting theory [14].

Among the languages implementing rewriting logic, we can cite CafeOBJ [56] and Maude [15, 16].

3.2 Maude language

Defined by J. Meseguer, the Maude language [15, 16] is one of the most powerful implementations of rewriting logic. Maude is a high-level, very powerful declarative language for constructing various kinds of applications based on both equational and rewriting logics. It offers few syntactic constructs and a well-defined semantics. The basic unit of specification and programming in Maude is the module. In fact, there are three types of modules. Functional modules define sorts of data and the operations on these data through equational theories. The sorts of data are composed of elements that are denoted by terms. A functional module is declared according to the following syntax:

fmod MODULE-NAME is … endfm

A minimal illustrative example of a functional module is given below.
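To make the functional-module syntax concrete, here is a minimal sketch of such a module (the module name NAT-PAIR, the sort Pair and its operations are illustrative assumptions of ours, not material from the paper):

fmod NAT-PAIR is
  protecting NAT .                      *** reuse Maude's predefined natural numbers
  sort Pair .                           *** a new sort of data
  op <_,_> : Nat Nat -> Pair [ctor] .   *** constructor building a pair of naturals
  op swap : Pair -> Pair .              *** an operation on pairs
  vars N M : Nat .
  eq swap(< N, M >) = < M, N > .        *** its meaning, defined equationally
endfm

Under these assumptions, reducing the term swap(< 1, 2 >) in this module (e.g., with Maude's reduce command) yields the normal form < 2, 1 >, computed purely by equational simplification.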
System modules specify a rewriting theory. A system module has sorts and operations, and can have equations and rewriting rules, which can be conditional. A system module is declared as follows:

mod MODULE-NAME is … endm

What a system module adds, compared to a functional module, is the ability to specify rewriting rules. The unconditional rules are declared as follows: rl [