Volume 36, Number 4, December 2012, ISSN 0350-5596

Informatica: An International Journal of Computing and Informatics

Special Issue: Advances in Network Systems
Guest Editors: Andrzej Chojnacki, Andrzej Kowalski, Bohdan Macukow, Maciej Grzenda

Editorial Boards, Publishing Council

Informatica is a journal primarily covering intelligent systems in the European computer science, informatics and cognitive community; scientific and educational as well as technical, commercial and industrial. Its basic aim is to enhance communication between different European structures on the basis of equal rights and international refereeing. It publishes scientific papers accepted by at least two referees outside the author's country. In addition, it contains information about conferences, opinions, and critical examinations of existing publications, as well as news. Finally, major practical achievements and innovations in the computer and information industry are presented through commercial publications as well as through independent evaluations.

Editing and refereeing are distributed. Each editor from the Editorial Board can conduct the refereeing process by appointing two new referees or referees from the Board of Referees or Editorial Board. Referees should not be from the author's country. If new referees are appointed, their names will appear in the list of referees. Each paper bears the name of the editor who appointed the referees. Each editor can propose new members for the Editorial Board or referees. Editors and referees inactive for a longer period can be automatically replaced. Changes in the Editorial Board are confirmed by the Executive Editors. The necessary coordination is carried out by the Executive Editors, who examine the reviews, sort the accepted articles and maintain appropriate international distribution. The Executive Board is appointed by the Society Informatika. Informatica is partially supported by the Slovenian Ministry of Higher Education, Science and Technology.
Each author is guaranteed to receive the reviews of his article. When accepted, publication in Informatica is guaranteed in less than one year after the Executive Editors receive the corrected version of the article.

Executive Editor - Editor in Chief: Anton P. Železnikar, Volaričeva 8, Ljubljana, Slovenia; s51em@lea.hamradio.si; http://lea.hamradio.si/~s51em/

Executive Associate Editor - Managing Editor: Matjaž Gams, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia; Phone: +386 1 4773 900, Fax: +386 1 251 93 85; matjaz.gams@ijs.si; http://dis.ijs.si/mezi/matjaz.html

Executive Associate Editor - Deputy Managing Editor: Mitja Luštrek, Jožef Stefan Institute; mitja.lustrek@ijs.si

Executive Associate Editor - Technical Editor: Drago Torkar, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia; Phone: +386 1 4773 900, Fax: +386 1 251 93 85; drago.torkar@ijs.si

Contact Associate Editors: Europe, Africa: Matjaž Gams; N. and S. America: Shahram Rahimi; Asia, Australia: Ling Feng; Overview papers: Maria Ganzha

Editorial Board: Juan Carlos Augusto (Argentina), Costin Badica (Romania), Vladimir Batagelj (Slovenia), Francesco Bergadano (Italy), Marco Botta (Italy), Pavel Brazdil (Portugal), Andrej Brodnik (Slovenia), Ivan Bruha (Canada), Wray Buntine (Finland), Zhihua Cui (China), Ondrej Drbohlav (Czech Republic), Hubert L. Dreyfus (USA), Jozo Dujmovic (USA), Johann Eder (Austria), Ling Feng (China), Vladimir A. Fomichov (Russia), Maria Ganzha (Poland), Marjan Gušev (Macedonia), N. Jaisankar (India), Dimitris Kanellopoulos (Greece), Samee Ullah Khan (USA), Hiroaki Kitano (Japan), Igor Kononenko (Slovenia), Miroslav Kubat (USA), Ante Lauc (Croatia), Jadran Lenarčič (Slovenia), Shiguo Lian (China), Huan Liu (USA), Suzana Loskovska (Macedonia), Ramon L. de Mántaras (Spain), Angelo Montanari (Italy), Pavol Návrat (Slovakia), Jerzy R. Nawrocki (Poland), Nadia Nedjah (Brasil), Franc Novak (Slovenia), Marcin Paprzycki (USA/Poland), Ivana Podnar Žarko (Croatia), Karl H.
Pribram (USA), Luc De Raedt (Belgium), Shahram Rahimi (USA), Dejan Rakovic (Serbia), Jean Ramaekers (Belgium), Wilhelm Rossak (Germany), Ivan Rozman (Slovenia), Sugata Sanyal (India), Walter Schempp (Germany), Johannes Schwinn (Germany), Zhongzhi Shi (China), Oliviero Stock (Italy), Robert Trappl (Austria), Terry Winograd (USA), Stefan Wrobel (Germany), Konrad Wrona (France), Xindong Wu (USA)

Editors' Introduction to the Special Issue on "Advances in Network Systems"

The tremendous development of network systems and network applications observed over recent years has opened ever more new research areas. The success of WWW applications, API exposure via web services and ubiquitous mobile access is immediately followed by significant challenges related to performance, security and privacy in network systems. In particular, large network operators managing core high-bandwidth networks and providing services to millions of end users face a growing demand for increased data transfer, quality of service and novel services. This results in unprecedented research and development of numerous network services, applications and systems. To stimulate cooperation between the commercial research community and academia, the first edition of the Frontiers in Network Applications and Network Systems symposium was organized in 2012. The main idea of providing a forum for academia and application-oriented research was fulfilled by the organizers of the event. These are: Orange Labs Poland, a part of a global chain of R&D centres of the France Telecom group, and two leading Polish academic communities, namely the Faculty of Mathematics and Information Science of the Warsaw University of Technology and the Faculty of Cybernetics of the Military University of Technology. Unlike many other network-related events, the conference was part of a larger event, providing the opportunity to discuss network-related issues in a wider community.
The symposium was part of the Federated Conference on Computer Science and Information Systems (FedCSIS), organized in 2012 in Wrocław. This provided a basis for active cooperation with other events of the multiconference. Among other areas, artificial intelligence (AI) and soft computing can be mentioned in this context. On the one hand, AI models are frequently used to deal with network-related problems and, among other applications, provide a basis for network intrusion detection. On the other hand, the unprecedented volume of data transferred in modern network systems opens new areas in modern data analysis. This special issue includes selected extended versions of papers presented during the Frontiers in Network Applications and Network Systems symposium. The selection of papers illustrates the wide range of active research areas in modern network systems. These include the exposure of telco infrastructure via web services, i.e., opening the complex world of telecom systems via standardized web services, and the benefits arising from this trend. Another aspect is the monitoring of network systems, with particular emphasis on anomaly and intrusion detection. Finally, new questions raised by the constantly growing range of mobile solutions have to be answered. The first paper, entitled "E-health oriented application for mobile phones", authored by A. Podziewski, K. Litwiniuk and J. Legierski, shows new perspectives created by opening telecommunication infrastructure via a set of web services. An e-health application using services such as determining the approximate location of a mobile terminal is proposed. Its key part is the use of web services exposing underlying mobile network functionalities. This illustrates the promising perspective of integrating complex mobile infrastructure capabilities with third-party applications in accordance with the SOA paradigm.
At the same time, this provides one more example of the need for system security and of the balance between the usability of a system and the user's privacy. Intrusion detection has been an active research area for many years. The second work, entitled "Artificial Immune System Inspired Intrusion Detection System Using Genetic Algorithm", authored by Amira Sayed A. Aziz, Mostafa Salama, Aboul ella Hassanien and Sanaa El-Ola Hanafi, contributes to this area. The authors present the use of genetic algorithms and different distance measures for network anomaly detection. The next work, "Usage of Holt-Winters Model and Multilayer Perceptron in Network Traffic Modelling and Anomaly Detection", authored by M. Szmit, A. Szmit, S. Adamus and S. Bugala, also contributes to this area. In particular, it shows the way network-related research is frequently combined with network application development, in this case the development of a network anomaly detection application. Finally, P. Bžoch, L. Matejka, L. Pešička and J. Šafařík, in their work "Design and Implementation of a Caching Algorithm Applicable to Mobile Clients", address the need for novel caching algorithms. This refers to another aspect of modern network systems, i.e., mobile network systems and the need for more efficient data handling methods addressing the unique features of mobile networks. The guest editors would like to thank Professor Matjaž Gams for the opportunity to publish this special volume and share the ideas raised during the conference with the international research community. The editors would also like to thank the authors for sharing the results of their research, and the Program Committee members for their contribution to this conference and for reviewing the papers submitted to this special issue.
Andrzej Chojnacki, Andrzej Kowalski, Bohdan Macukow, Maciej Grzenda
Guest Editors

E-health Oriented Application for Mobile Phones

Andrzej Podziewski, Warsaw University of Technology, Faculty of Electronics and Information Technology, 15/19 Nowowiejska Street, 00-665 Warsaw, Poland; E-mail: a.podziewski@gmail.com
Kamil Litwiniuk, Warsaw University of Technology, Faculty of Electronics and Information Technology, 15/19 Nowowiejska Street, 00-665 Warsaw, Poland; E-mail: kamillitw@gmail.com
Jarosław Legierski, Orange Labs Poland, 7 Obrzeżna Street, 02-691 Warsaw, Poland; E-mail: jaroslaw.legierski@orange.com

Keywords: service delivery platform, SDP, API exposure, telco 2.0, e-health

Received: September 25, 2012

This paper presents the idea of using mobile operators' APIs in an e-health usage scenario. Since numerous elderly people go missing every year, the proposed emergency location service presents a way in which mobile operators' networks, the Internet and the possibilities offered by the rapid improvement of phone functionality can converge in order to alleviate the problem. The description of the presented solution is supplemented with sets of accuracy measurements and usability tests conducted during a test deployment. The results confirm the usability potential of the service, giving a green light for further research and development. Still, in order to make the service reliable, the algorithms used to determine location and detect falls need to be improved. The article presents a method that may be used to improve the location accuracy.

Povzetek (in English): The paper describes methods for helping the elderly, primarily locating them with the help of a mobile phone.

1 Introduction

The societies of highly developed countries are gradually becoming older, and therefore the phenomenon of elderly people going missing is becoming noticeable [1], the main reasons being health issues such as memory loss and spatial orientation problems.
Additionally, elderly people are more likely to lose consciousness and fall due to their health problems; such situations always require an instant reaction and often hospitalization. A rapid response is not always possible, especially if the location of the person is unknown. Therefore, the main idea behind the proposed service is to provide its users with a reliable and fast-to-use location service that can be easily, or even automatically, invoked in case of emergency, without the need to carry additional electronic equipment.

2 Existing Solutions

At the moment, there are numerous GPS-based location systems available that can be used in medical assistance, such as Finder On-Line [2]. Those solutions are designed to work solely outdoors, where a GPS signal is available. Other systems, like ZUPS [3], are designed for indoor location and require dedicated devices and infrastructure. Since mobile network cell-based location outperforms both solutions in range and reliability, it stands out as an interesting area of research. Despite its lower accuracy, it has the additional advantage of very low cost. In this paper we investigate the functionality of a simple emergency location system built upon cellular network infrastructure.

3 The Idea of Telco 2.0 and Telco Web Services

In recent years the Internet has gone through major changes. The idea of Web 2.0 has transformed the way in which the network is used and perceived. In the days of Web 1.0, the typical Internet user was mainly a passive consumer of content such as web portals. Possible user activities were not related to the then-static World Wide Web and were limited to sending e-mails and participating in chats or newsgroups. At the moment, Internet users have numerous possibilities of dynamically creating their own content: participating in social networking sites, writing blogs, collaborating on wikis and building web sites using content mashed up from other pages and portals [4], [5].
Telecom operators, seeing the immense potential behind Web 2.0, have aggressively tried to implement a similar two-sided business model based on service exposure platforms [6], [7]. Their goal was to monetize existing network assets more efficiently by leveraging third-party developers and service providers. This concept is currently known as Telco 2.0 and is actively researched, resulting in numerous new applications like [8], [9], [10], [11], [12]. In the Telco 1.0 model, telecommunication networks are closed to external entities and only the operator is able to create services and telecom applications. In the Telco 2.0 model, the operator's network functionalities are made available to external developers by exposing sets of interfaces on the Internet. This approach allows companies and universities to build, test and deploy their own services based on telecom infrastructure. From the practical point of view, the most significant difference between Telco 1.0 and 2.0 is the way in which telecom resources are accessed. In Telco 2.0 it is done using Web Services technology (Fig. 1), which is predominant in the IT sector, as opposed to the telecom-network-specific protocol stack used in Telco 1.0. Through Telco Web Services, a Telco 2.0 implementation supports the most popular access models, such as the RESTful architectural style (Representational State Transfer, the most popular on the Internet [13]) and SOAP (Simple Object Access Protocol), in accordance with SOA (Service-Oriented Architecture) guidelines, a de facto standard in the enterprise sector [14]. In comparison with the traditional way, the use of Telco 2.0 interfaces allowed for a significant reduction in the time required to develop a service. Therefore, as our goal was to confirm whether it was possible to build a usable emergency location service upon mobile network infrastructure, Telco Web Services was chosen as the optimal solution.
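To make the RESTful access model just described concrete, the sketch below builds a terminal-location query URL and parses a JSON reply in the style of such exposed telecom web service APIs. The endpoint, parameter names and response layout are invented for illustration; they do not reproduce any particular operator's actual schema.

```python
import json
from urllib.parse import urlencode

# Hypothetical operator gateway; real exposure platforms differ.
BASE_URL = "https://api.example-operator.com/location/v1/queries/location"

def build_location_request(msisdn: str, requested_accuracy_m: int) -> str:
    """Build a RESTful terminal-location query URL (illustrative only)."""
    params = {"address": "tel:" + msisdn, "requestedAccuracy": requested_accuracy_m}
    return BASE_URL + "?" + urlencode(params)

def parse_location_response(body: str) -> tuple:
    """Extract (latitude, longitude, accuracy) from a sample JSON reply."""
    data = json.loads(body)
    info = data["terminalLocationList"]["terminalLocation"][0]["currentLocation"]
    return info["latitude"], info["longitude"], info["accuracy"]

# Example response body, modelled loosely on resource-oriented location APIs.
sample = json.dumps({
    "terminalLocationList": {"terminalLocation": [{
        "address": "tel:+48500100200",
        "currentLocation": {"latitude": 52.2297, "longitude": 21.0122,
                            "accuracy": 800}
    }]}
})

url = build_location_request("+48500100200", 1000)
lat, lon, acc = parse_location_response(sample)
```

The appeal for IT-oriented programmers is visible here: a location query becomes an ordinary HTTP request with a JSON payload rather than a binary telecom protocol exchange.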
The key element of the operator's network responsible for the exposure of communication APIs is the Service Delivery Platform (SDP). The south interfaces of this system are connected directly to the network enablers responsible for telecommunication functions, using telecom-specific binary protocols such as SMPP (Short Message Peer-to-Peer) and UCP (Universal Computer Protocol) for SMS services, MAP (Mobile Application Part of the SS7 stack) for USSD (Unstructured Supplementary Service Data), or the MM7 interface for MMS messages. Because of their binary character and their implementation oriented towards specific telecommunication functions, these protocols are difficult to use for (mostly) IT-oriented programmers. The north interfaces of the SDP are connected to the Internet. The exposed APIs provide developers with more user-friendly interfaces in Web Services form. The first implementation of Web Services dedicated to the exposure of communication APIs followed the SOA model as the Parlay X standard [15]. In recent years we have observed an expansion of RESTful Web Services on the Internet. In response to this trend, the newest telecommunication API specifications are resource-oriented (e.g. the OneAPI standard [16]).

4 The Emergency Button Service

The majority of emergency information and fall detection systems require specifically designed hardware and software, which limits their commercial availability to the wealthiest users. This paper proposes a low-priced system that uses reliable, ubiquitous technology: the mobile phones that most people carry every day. Every cell phone is suitable to activate the basic service, and using a slightly more expensive smartphone significantly boosts its functionality. With this in mind, as typical end users of the service we have chosen people with orientation disorders, memory loss or in danger of losing consciousness, as well as their families and people in any way responsible for their wellbeing (social workers, nursing homes etc.).
As will be described later, due to its nature, use of the service can be tailored to fit any situation where base-station location is sufficiently accurate.

5 System Architecture

In this section, service activation is presented as a way to establish interaction between Seniors and Guardians. Seniors are the people who require attention due to their health issues. They are the ones to invoke the service (intentionally, when lost, or automatically, when a fall is detected). Guardians, on the other hand, are those to be informed about the Senior's location in case of emergency. When the Emergency Button service is invoked, a message containing approximate GPS coordinates and an address is sent via SMS to the Guardian's cell phone (Figure 2).

Figure 2: Service invocation scenarios. Source: [10].

As stated before, the Emergency Button may be invoked in numerous ways. The most basic one is to send a USSD (Unstructured Supplementary Service Data) message.

Figure 1: Telco Web Services architecture. Source: [10].
E-health Oriented Application for... Informatica 36 (2012) 341-346

A specific code can be stored in the phone's memory or on the SIM card and assigned to a speed-dial button for ease of access. Besides this, a full-screen widget for Android smartphones was developed; the EB service can thus be activated by simply touching almost anywhere on the screen while the phone is unlocked. The most promising implemented way to invoke the service is EB Fall. It is an Android application that controls a background service (not to be confused with the EB service itself; in simple words, it is an application without a user interface running in the background). It is responsible for detecting a fall caused by the owner losing consciousness, using data from the built-in accelerometer.
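The accelerometer-based detection just described can be illustrated with a minimal threshold sketch: a free-fall phase (acceleration magnitude near zero) followed shortly by an impact spike. The thresholds and window sizes below are assumed values for illustration only, not the heuristic actually implemented in EB Fall.

```python
import math

# Illustrative thresholds in m/s^2; the paper's heuristic (based on [17], [18])
# is not reproduced here. Samples are assumed to arrive at a fixed rate.
FREE_FALL_THRESHOLD = 4.0   # magnitude well below 1 g suggests free fall
IMPACT_THRESHOLD = 25.0     # magnitude well above 1 g suggests impact
MAX_GAP_SAMPLES = 20        # impact must follow free fall within this window

def magnitude(sample):
    """Euclidean norm of a 3-axis accelerometer sample (x, y, z)."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def detect_fall(samples):
    """Return True if a free-fall phase is followed shortly by an impact."""
    free_fall_at = None
    for i, s in enumerate(samples):
        m = magnitude(s)
        if m < FREE_FALL_THRESHOLD:
            free_fall_at = i
        elif m > IMPACT_THRESHOLD:
            if free_fall_at is not None and i - free_fall_at <= MAX_GAP_SAMPLES:
                return True
    return False

# Synthetic trace: rest (~1 g), brief free fall (~0 g), impact, rest again.
rest = [(0.0, 0.0, 9.81)] * 5
fall = [(0.1, 0.1, 1.0)] * 3
impact = [(5.0, 20.0, 25.0)]
trace = rest + fall + impact + rest
```

In a real deployment this check would run continuously in the background service, which is exactly where the false-positive versus sensitivity trade-off discussed later in the paper arises.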
If the Senior does not respond within a given period of time, the EB service is activated. The implemented heuristic model of a fall is based on measurements and findings presented in [17] and [18]. In order to implement the service, the following Telco 2.0 API functionalities were used:
• Send USSD - for invoking the service from the Senior's cell phone,
• Terminal Location - for determining the location of the Senior's cell phone by means of cell identification,
• Send SMS - to inform the Guardian about an emergency situation.
As shown in Fig. 3, the main component of the developed service is the application server, where the main software components are deployed. The first one is responsible for running the logic of the service: receiving USSD messages, obtaining the location and sending SMS messages to Guardians. It cooperates with the second module, designed to maintain communication with the database and process incoming requests. The last module is the graphical user interface, a web page allowing the Guardians to register in the service, add Seniors and maintain associations with them, as well as view recent service activations on a map.

Figure 3: Structure of the developed service. Source: [10].

6 Measurements

In order to check the accuracy of the location returned by the mobile network, measurements in 60 random locations were taken in the Warsaw area, using both the GSM and 3G (UMTS) Radio Access Networks (RAN). As the reference position we used data from an external GPS device. To assess reliability and actual usability, 7 tests were conducted in order to determine whether it was possible to find a lost person without using any means other than the EB service.

6.1 Accuracy tests

The histograms represent the distribution of the location error for both the GSM mode (Fig. 4) and the UMTS mode (Fig. 5). The obtained accuracy was higher in the GSM mode.
The reason behind those results is that in Warsaw, the GSM cells of the mobile network in use (Orange) were significantly smaller than the UMTS cells [19].

Figure 4: Error of location prediction for the GSM RAN. Source: [10].

Since every mobile carrier's RAN structure is different (e.g. an operator might maintain only one type of access network, have a higher base station density etc.), it cannot be stated with certainty that a specific type of network allows for better accuracy. Chances are that if a UMTS network was the only type of RAN maintained by an operator, its performance in terms of location error could be much better due to UMTS network characteristics (relatively small cells to provide good HSPA coverage [20], wide use of picocells [21] adding capacity in areas with dense phone usage).

Figure 5: Error of location prediction for the UMTS RAN. Source: [10].

6.2 Usability tests

The results of the usability tests were obtained in accordance with the following rules:
• one person (the Senior) had to choose a random place in Warsaw to invoke the service and then wait for rescue,
• the other tester (the Guardian) had to find the Senior wasting as little time as possible,
• the Guardian was only allowed to use means of public transport,
• the service could be invoked only once per test,
• no way of communication between the Guardian and the Senior was allowed other than activating the service.
All the measured distances are "as the crow flies": the length of a straight line segment connecting the points is given.

Nb. | Distance Senior-Guardian [m] | Location error [m] | Time to find [min]
1 | 1655 | 198 | 19
2 | 3153 | 210 | 44
3 | 1092 | 125 | 15
4 | 377 | 686 | n/a
5 | 852 | 377 | 48
6 | 3710 | 290 | 35
7 | 410 | 144 | 41

Table 1: Results of the usability tests. Source: own research.

Usability tests (Tab. 1) show that if the location error is smaller than 200 m, finding a lost person in a very short period of time is possible, as in tests 1 and 3.
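The straight-line ("as the crow flies") distances reported in Table 1 correspond to great-circle distances, which can be computed from two coordinate pairs with the haversine formula. The sample coordinates below are hypothetical Warsaw points roughly 1.6 km apart, not the actual test locations.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle ("as the crow flies") distance in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Hypothetical Senior and Guardian positions (latitude, longitude).
senior = (52.2297, 21.0122)
guardian = (52.2200, 21.0300)
d = haversine_m(senior[0], senior[1], guardian[0], guardian[1])  # ~1.6 km
```

At city scale the haversine result is essentially identical to the planar straight-line distance, which is why it is a reasonable way to reproduce the Table 1 figures from raw GPS fixes.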
Still, even if the error is relatively small, the search might take a long time if the conditions are unfavorable, as in tests 2 (university campus) and 7 (Warsaw Metro construction site). When an emergency situation occurs in a densely built-up residential area and the location error is higher than 300 meters, finding the Senior is still possible, but requires a very methodical, lengthy search (as in test 5), as the area to be explored grows quadratically with the error. As the error increases and the person in need is not within sight, the time needed to find them increases dramatically, up to a point where doing so in a reasonably short time is almost impossible (test 4). Therefore, a small error (less than 200 m) does not necessarily guarantee a short search time, but a large one (more than 200 m) always results in a lengthy walk.

7 Challenges

Using a mobile phone for location and fall detection has numerous advantages over specialized systems, the main ones being its low cost and the fact that a phone is already considered an indispensable device by most people and is therefore carried on a daily basis. Still, there are numerous weaknesses of the EB service that need to be taken into account and addressed during future development. As opposed to GPS-based systems, the proposed service works indoors as well as outdoors and is relatively invulnerable to difficult weather conditions. Unfortunately, due to the relatively large size of cells in cellular networks, if cell identification alone is used, location errors may reach up to several kilometers in sparsely populated areas. This issue might be resolved by implementing client-side GPS support or triangulation algorithms that would process data from the Senior's phone, such as signal strength and neighboring base stations. This would require phone-specific software to run on the Senior's mobile phone as well as a reliable data connection, making the service more precise, but considerably less universal.
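The quadratic growth of the search area with the location error noted above can be made concrete: if the Senior may be anywhere within one error radius of the reported point, the area to sweep is pi times the radius squared, so doubling the error quadruples the area. The short sketch below works through the arithmetic using the error values from Table 1.

```python
import math

def search_area_km2(location_error_m: float) -> float:
    """Area of the circle that must be searched, assuming the Senior may be
    anywhere within one location-error radius of the reported point."""
    r_km = location_error_m / 1000.0
    return math.pi * r_km ** 2

# Comparing the ~200 m errors of the quick finds with the 686 m error of
# usability test 4, where the Senior could not be found at all.
a200 = search_area_km2(200)   # ~0.13 km^2
a686 = search_area_km2(686)   # ~1.48 km^2
ratio = a686 / a200           # (686/200)^2, roughly 11.8x more ground to cover
```

This back-of-the-envelope figure is consistent with the observation that test 4 became practically infeasible despite the short Senior-Guardian distance.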
In order to improve the handset location accuracy without any client-side support, the operator should implement more advanced network-based tracking techniques. It is possible to determine the sector in which the mobile phone resides and to estimate the distance to the serving base station by measuring the radio signal propagation time delay [22]. Moreover, triangulation techniques can be used to determine the position using data on signal parameters from neighboring base stations. Methods dedicated to improving terminal location accuracy are discussed in section 8. Since the service is intended to be used in cases of loss of consciousness as well, improving the reliability of the implemented fall detection algorithm is a matter of utmost importance. The greatest challenge is the reduction of the number of false positives while maintaining high sensitivity to real falls. In its current form, the service strictly relies on the interfaces provided by the mobile network operator. Consequently, as not all cellular operators expose the necessary Telco 2.0 interfaces, it is only possible to use it within one network. Hopefully this matter will be resolved by the operators themselves by exposing proper APIs and providing functional inter-operator links for USSD messages and location services. Another concern is the possibility of confidentiality breaches, which applies to all location-based services. Subscribers' location and movement data is controlled and owned by the mobile network operators. Since it has the potential to be used for adversarial purposes, it may only be revealed under strict conditions that are defined by the telecommunications law of a given country [23]. Therefore, before any commercial deployment, legal precautions need to be taken in order to prevent any misuse of the data provided.
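The propagation-delay distance estimate mentioned above can be sketched for GSM, where the Timing Advance (TA) parameter quantizes the round-trip radio delay in steps of one GSM bit period (48/13 microseconds), so each TA step corresponds to roughly 550 m of one-way distance. The constants follow the GSM radio interface; the code itself is an illustrative calculation, not the operator-side implementation used by the service.

```python
# GSM Timing Advance sketch: one TA step equals one bit period of round-trip
# delay, so the one-way distance per step is c * (48/13 us) / 2 ~ 553.5 m.
SPEED_OF_LIGHT = 299_792_458.0        # m/s
GSM_BIT_PERIOD_S = 48.0 / 13.0 / 1e6  # ~3.69 microseconds per TA step

def ta_step_metres() -> float:
    """One-way distance represented by a single TA step."""
    return SPEED_OF_LIGHT * GSM_BIT_PERIOD_S / 2.0

def distance_ring_m(ta: int) -> tuple:
    """Distance interval (min, max) to the handset for a given TA value.
    Combined with the serving sector, this narrows Cell ID location to an
    annular ring segment instead of the whole cell."""
    step = ta_step_metres()
    return ta * step, (ta + 1) * step

step = ta_step_metres()          # ~553.5 m per TA step
near, far = distance_ring_m(3)   # handset between ~1660 m and ~2214 m
```

This is why Cell ID + TA in Table 2 still shows errors of hundreds of metres at best: the ring width itself is over half a kilometre.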
8 Terminal Location Accuracy Improvement Methods

The simplest method used in Location Based Services for mobile terminal location is based on the network Cell ID, sometimes extended by a calculation of the radius between the BTS and the mobile station defined by the Timing Advance (TA) parameter. In this method, the approximate longitude and latitude of the mobile terminal are calculated by the GMLC (Gateway Mobile Location Centre) based on the geographical center of the mobile cell. This method is dedicated to the GSM system and is currently used by the Emergency Button application. Other mobile terminal location methods in this area are presented below [25]:
• U-TDOA (Uplink Time Difference of Arrival) - dedicated to GSM and UMTS, based on the delay of radio signal propagation from the terminal to the network,
• E-OTD (Enhanced Observed Time Difference) - a standard hybrid method for GSM; conceptually, the method is similar to U-TDOA,
• OTDOA (Observed Time Difference of Arrival) - a standard defined in the UMTS system, based on the measurement of the delay of the radio signal propagating from the network to the terminal. OTDOA relies on measuring the signal delay from a minimum of three base stations [25], [26], [27]. Using this method, the location error can be reduced to 50 m in cities.
The methods presented above, based on signal propagation time measurement (U-TDOA, E-OTD, OTDOA), can be implemented by installing specific hardware and software on the provider's side. Another approach to reducing the location error focuses on methods dedicated to implementation on the mobile terminal side. The standard method in this area is A-GPS (Assisted GPS), a typical mobile terminal location method using an assisted GPS receiver installed in the mobile terminal (dedicated to UMTS). In recent years we could observe the rapid development of hybrid methods.
This kind of location service is based on the observation of information from different sources:
• wireless network SSID presence,
• Network Measurement Reports (signal strength from up to 6 Base Transceiver Stations) observed by the mobile terminal [27], [29],
• GPS, accelerometer and gyroscope.
Information from all the sources presented above is correlated with an existing database and utilized to support the calculation of the mobile station's position. Unfortunately, hybrid methods are available only to users with smartphones and tablets, because of dedicated hardware requirements (WiFi, accelerometer, gyroscope, GPS). An example API based on hybrid methods is described in [30] and is defined as a common source of location information (GPS and location inferred from network signals such as IP address, RFID, WiFi and Bluetooth MAC addresses, and mobile network cell IDs).

Method | Country | Suburban | City | Inside buildings
Cell ID | 1-35 km | 1-10 km | 50 m - 1 km | 50 m - 1 km
Cell ID + TA | 1-35 km | 1-10 km | 50 m - 1 km | 50 m - 1 km
U-TDOA, OTDOA | 80 m | 50 m | 50 m | 50 m
E-OTD | 50-150 m | 50-150 m | 50-150 m | 50-150 m
A-GPS | 30 m | 40 m | 30-150 m | 200 m

Table 2: Comparison of the accuracy of methods dedicated to mobile terminal location [25].

9 Conclusion

The presented system is a viable solution for low-cost emergency location. Using existing technologies and the simplest cell-identification-based location, we proved that it is possible to find a lost person in a densely populated area with only the data from the received text message. Further improvement in accuracy and better reliability can be obtained if the operators decide to improve their handset tracking technologies. This is probable, since by exposing Telco 2.0 interfaces they have received an easy-to-use way of providing location services to external entities. If that is not the case, better accuracy can be obtained by using less universal client-side solutions.
Future work ought to focus on improving the accuracy of the returned location, which is crucial for reducing the time necessary to find a lost person. Another problem that needs to be addressed is the limited reliability of the implemented fall detection algorithm. Despite the presented use case, the service can be easily adapted to different emergency situations, e.g. informing the municipal police about dangerous situations on public transportation or finding lost children.

References
[1] M. J. Patterson: Who is missing? A study of missing persons in B.C., Simon Fraser University, 2005
[2] Finder Telematic Solutions, available: http://www.finder.pl/
[3] A. Marco, R. Casas, J. L. Falco, H. J. Gracia, J. I. Artigas, A. Roy: Location-based services for elderly and disabled people, Computer Communications Journal, April 2008
[4] N. Banerjee, K. Dasgupta: Telecom Mashups: Enabling Web 2.0 for Telecom Services, ICUIMC '08 Proceedings of the 2nd International Conference on Ubiquitous Information Management and Communication, Korea, 2008
[5] S. A. Ahson, M. Ilyas: Service Delivery Platforms: Developing and Deploying Converged Multimedia Services, CRC Press Taylor & Francis Group, 2011
[6] M. Sredniawa: Telecommunications Reinvented, Proceedings of the XIV Poznan Telecommunications Workshop, Poznan, 2010
[7] STL Partners portal, http://www.telco2.net/ [accessed 7.11.2012]
[8] D. Bogusz, A. Podziewski, K. Litwiniuk, J. Legierski: Telco 2.0 for UC - an example of integration of a telecommunications service provider's SDP with an enterprise UC system, FedCSIS/FINANS Conference, Wroclaw, 2012, IEEE Xplore
[9] K. Litwiniuk, T. Czarnecki, S. Grabowski, J. Legierski: Bus Stop - a Telco 2.0 application supporting public transport in agglomerations, FedCSIS/FINANS Conference, Wroclaw, 2012, IEEE Xplore
[10] A. Podziewski, K. Litwiniuk, J. Legierski: Emergency Button - a Telco 2.0 application in the e-health environment, FedCSIS/FINANS Conference, Wroclaw, 2012, IEEE Xplore
[11] H. Rosa, D.
Krasinska: USSD Survey Guide, Orange Labs, 2011
[12] A. Tylman, J. Jankowski: Assumptions and scope of Work Control - trial service based on Parlay X technology, Orange Labs, 2010
[13] L. Richardson, S. Ruby, D. Heinemeier Hansson: RESTful Web Services, O'Reilly, 2007
[14] E. Newcomer: Understanding Web Services: XML, WSDL, SOAP, and UDDI, Independent Technology Guides, 2003
[15] Open Service Access (OSA); Parlay X web services, 3GPP Technical Specifications TS 29.199-01 - TS 29.199-22
[16] OneAPI portal, http://www.oneapi.gsma.com/ [7.11.2012]
[17] F. Sposaro, G. Tyson: iFall: An Android Application for Fall Monitoring and Response
[18] T. Zhang, J. Wang, P. Liu, J. Hou: Fall detection by embedding an accelerometer in cellphone and using KFD algorithm, IJCSNS International Journal of Computer Science and Network Security, 6(10), October 2006
[19] Orange RAN map, available: http://mapa.btsearch.pl/
[20] D. Maidment: Understanding HSDPA's Implementation Challenges, picoChip Designs, 2005
[21] R. Kumar: A picocell primer, Texas Instruments, 2006
[22] M. O. Sunay, I. Tekin: Mobile location tracking in DS CDMA networks using forward link time difference of arrival and its application to zone-based billing, Global Telecommunications Conference, GLOBECOM '99, 1999
[23] L. Ackerman, J. Kempf, T. Miki: Wireless Location Privacy: Law and Policy in the U.S., EU and Japan
[24] Orange Labs Telco 2.0 University portal, www.tu.rd.tp.pl
[25] J. Stefanski: Metody i standardy pozycjonowania terminali w systemach komórkowych, Przegląd Telekomunikacyjny, nr 6, 2006, Wydawnictwo Sigma-NOT, Warszawa, Poland
[26] M. Miernik: Metody i procedury lokalizacji abonenta w sieciach komórkowych GSM2+/3G, Przegląd Telekomunikacyjny 5/2003, Wydawnictwo Sigma-NOT, Warszawa, Poland
[27] J. Stefanski: Radio Link Measurement Methodology for Location Service Applications, Metrology and Measurement Systems, vol. XIX, no. 2
[28] P. Korbel, P. Wawrzyniak, P. Pątek, J.
Legierski: NMR Recorder - narzędzie do gromadzenia informacji pomiarowych z terminala komórkowego, KKRRiT 2012, Przegląd Telekomunikacyjny 4/2012
[29] B. Zacharuk, A. Tylman, P. Pątek, S. Grabowski, J. Legierski: NMR API - nowy interfejs programistyczny w modelu Telco 2.0 i propozycje jego zastosowań, KKRRiT 2012, Przegląd Telekomunikacyjny 4/2012
[30] Geolocation API, http://www.w3.org/TR/geolocation-API/ [7.11.2012]

Artificial Immune System Inspired Intrusion Detection System Using Genetic Algorithm

Amira Sayed A. Aziz, French University in Egypt, Scientific Research Group in Egypt (SRGE), Cairo, Egypt, E-mail: amiraabdelaziz@gmail.com and www.egyptscience.net
Mostafa A. Salama, British University in Egypt, Scientific Research Group in Egypt (SRGE), Cairo, Egypt, E-mail: mostafa.salama@gmail.com and www.egyptscience.net
Aboul Ella Hassanien, Faculty of Computers and Information, Cairo University, Chairman of the Scientific Research Group in Egypt (SRGE), Cairo, Egypt, www.egyptscience.net
Sanaa El-Ola Hanafi, Faculty of Computers and Information, Cairo University, Cairo, Egypt

Keywords: artificial immune system, intrusion detection, genetic algorithm, Minkowski distance

Received: October 11, 2012

Computer security is an issue that will always be under investigation, as intruders never stop finding ways to access data and network resources. Researchers try to find functions and approaches that would increase the chances of detecting attacks and at the same time would be less expensive in terms of time and space. In this paper, an approach is applied to detect anomalous activity in the network using detectors generated by a genetic algorithm. The Minkowski distance function is tested against the Euclidean distance in the detection process. It is shown that the Minkowski distance gives better results than the Euclidean distance and can achieve very good results in less time. It gives an overall average detection rate of 81.74% against 77.44% with the Euclidean distance.
In addition, formal concept analysis was applied on the data set containing only the selected features and used to visualize the correlation between highly effective features.

Povzetek: Predstavljena je varnostna metoda na osnovi umetnega imunskega sistema.

1 Introduction

Anomaly detection has been a widely researched problem in several application domains such as system health management, intrusion detection, health-care, bio-informatics, fraud detection, and mechanical fault detection. Traditional anomaly detection techniques analyse each data instance (as a univariate or multivariate record) independently, ignoring the sequential aspect of the data. Often, anomalies in sequences can be detected only by analysing data instances together as a sequence, and hence cannot be detected by traditional anomaly detection techniques [1]. Gonzalez and Dasgupta in [2] used a sequential niching technique with the genetic algorithm to generate the rules. Then, in [3], they suggested using a deterministic-crowding niching technique to limit the crowd by replacing parents with more fitted children. This time, the algorithm gave the same results with fewer rules, which is better because the population size does not change. This paper applies an approach for detecting network traffic anomalies using a genetic-algorithm-based intrusion detection system, but without the levels of abnormality. The algorithm is put under investigation to find which values of its parameters lead to better results, using the relatively new NSL-KDD data set. The rest of this paper is organized as follows. Section 2 presents a background on anomaly intrusion detection, artificial immune systems, genetic algorithms and formal concept analysis. Section 3 gives a review of work similar to that presented in this paper. Section 4 gives a description of the applied approach and its phases. Section 5 shows the experimental results and discusses observations. Finally, Section 6 addresses the conclusions.
2 Background

2.1 Anomaly Detection

An Intrusion Detection System (IDS) is a system built to detect outside and inside intruders in an environment by collecting and analysing its behaviour data. In earlier times, during the 70's and 80's, system administrators detected intrusions manually, by noticing anomalous actions and then by monitoring audit logs. The problem was that suspicious or anomalous actions were discovered only after they had taken place. Another problem was that audit logs were stacked with lots of activities, which were a burden to view and required a lot of time and high expertise to notice suspicious behavioural patterns. So, the need emerged for real-time systems that can detect such activities while they happen. By the 90's, IDSs were created to review audit data as it builds up, and over time they were developed to take actions in response to attacks [4]. IDSs can be categorized in many terms [5]; all categories are summarized in Figure 1 (detection approach: misuse, anomaly, hybrid; response: active, passive; structure: centralized, distributed, hierarchical; placement: host, network, hybrid; audit data: audit trail, network packets).

Figure 1: Intrusion Detection Systems classification

Misuse-based and anomaly-based detection are the two basic approaches followed to implement an IDS. In a misuse-based IDS, attacks are represented as a pattern or a signature to be used for detection. It is very good at detecting known attacks and provides detailed information on the detected ones, but it is of little use for unknown attacks. An anomaly-based IDS builds a model of a system's normal behaviour to use for detection, assuming all deviating activities to be anomalous or intrusions. It is very useful for finding unknown attacks, but it has high false negative and false positive rates; besides, it needs to be updated with system behaviour and cannot provide much information on detected attacks.
In some IDSs, a hybrid of both techniques is used [6]. Different approaches exist for Anomaly-based Network IDSs (A-NIDS), but in general they all consist of the following modules: (1) parametrization: representing the observed instances in some pre-defined form; (2) training: a model of the normal or abnormal system behaviour is built, either manually or automatically; and (3) detection: the (parametrized) monitored traffic is searched for anomalous behaviour using the system model built in the previous stage. The techniques used to build the system behavioural model can be statistical, knowledge-based, or machine-learning-based. Genetic Algorithms (GA) are among the machine-learning-based techniques. Flexible and robust global search is the main advantage of applying GAs in A-NIDS, where the search proceeds from multiple directions with no prior knowledge required about the system [7, 8].

2.2 Genetic Algorithms

Genetic Algorithms (GAs) are one of the Evolutionary Computation (EC) approaches. In general, EC can be involved in many tasks in IDSs, such as optimization, automatic model design, and classification [9]. GAs are basically used in IDSs to generate rules (build a classifier) used to detect anomalies [8]. They were inspired by biological evolution, natural selection, and genetic recombination. GAs treat candidate solutions as chromosomes that evolve through selection (usually random selection), crossover (recombination to produce new chromosomes), and mutation operators. Finally, a fitness function is applied to select the best (highly fit) individuals. The process is repeated for a number of generations until reaching the individual (or group of individuals) that closely meets the desired condition [8, 2]. GA is very promising in the computer security field, especially in IDSs. It has been applied to intrusion detection since the 1990's and is still in use today.
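The evolutionary loop described above (random selection, crossover, mutation, fitness-based replacement) can be illustrated with a toy example. The following is a minimal sketch of a generic GA maximizing the number of 1-bits in a chromosome; it is not the detector-generation algorithm discussed later in this paper, and all names and parameter values are illustrative:

```python
import random

random.seed(42)

GENOME_LEN = 20   # chromosome length (illustrative)
POP_SIZE = 30     # population size (illustrative)
GENERATIONS = 200

def fitness(chrom):
    # Toy fitness: the count of 1-bits (higher is better).
    return sum(chrom)

def crossover(p1, p2):
    # Single-point recombination producing one child.
    cut = random.randint(1, GENOME_LEN - 1)
    return p1[:cut] + p2[cut:]

def mutate(chrom, rate=0.1):
    # Flip each bit independently with the given probability.
    return [b ^ 1 if random.random() < rate else b for b in chrom]

# Random initial population of bit strings.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # Uniformly random parent selection, as described in the text.
    p1, p2 = random.sample(population, 2)
    child = mutate(crossover(p1, p2))
    # Replace the current worst individual if the child is fitter.
    worst = min(range(POP_SIZE), key=lambda i: fitness(population[i]))
    if fitness(child) > fitness(population[worst]):
        population[worst] = child

best = max(population, key=fitness)
print(fitness(best))  # best fitness found after the run
```

The replacement rule here is a simple elitist scheme; niching variants such as deterministic crowding differ mainly in which existing individual the child competes against.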
GA is usually used to generate rules for intrusion detection; the rules usually take the form if {condition} then {action}, where the condition part tests the fields of incoming network connections to detect the anomalous ones [8]. Niching techniques are known to assist EC techniques in finding multiple local optimal solutions in a population by creating sub-populations which assemble around local optima, so that there is diversity in the population [9]. They can be used in conjunction with GAs to find multiple solutions in one round without the need to run the GA multiple times.

2.3 Artificial Immune Systems

Artificial Immune Systems (AIS) were inspired by the Human Immune System (HIS), which is robust, decentralized, error tolerant, and adaptive. The HIS has different cells with many different tasks, so the resulting mimic algorithms exhibit differing levels of complexity and can accomplish a range of tasks. There are a number of AIS models used in pattern recognition, fault detection, computer security, and a variety of other applications in science and engineering. Most of these models emphasize designing and applying computational algorithms and techniques using simplified models of various immunological processes and functionalities [10, 11]. There exists no single algorithm from which all immune algorithms are derived, as AISs are designed using a number of algorithms [12]. The Negative Selection Approach (NSA) explains how T-cells are selected and how they mature in the system. T-cells are blood cells that belong to a group of white blood cells called lymphocytes. In the NSA, newly produced T-cells undergo an immature period during which recognition of a self antigen results in their death. The T-cells need activation to develop the ability to remove pathogens. They are exposed to a comprehensive sample of self antigens, then they are tested against self and non-self antigens to match the non-self ones.
If a T-cell matches a self antigen, it is removed; otherwise, it matures and is released into the system [13, 14].

2.4 Formal Concept Analysis

Formal Concept Analysis (FCA) is one of the data mining research methods, and it has been applied in many fields such as medicine. The basic structure of FCA is the formal context, which is a binary relation between a set of objects and a set of attributes. The formal context is based on an ordinary set whose elements have one of two values, 0 or 1 [15], [16]. A formal concept is defined as a pair (A, B) with A ⊆ G, B ⊆ M, intent(A) = B and extent(B) = A. The set A is called the extent and the set B the intent of the concept (A, B). The extent and the intent are derived by two functions, defined as:

intent(A) = {m ∈ M | ∀g ∈ A : (g, m) ∈ I}, A ⊆ G, (1)

extent(B) = {g ∈ G | ∀m ∈ B : (g, m) ∈ I}, B ⊆ M. (2)

Usually the attributes of a real-life data set are not in binary form; attributes can be expressed in a many-valued form with either discrete or continuous values. In that case the many-valued context takes the form (G, M, V, I), composed of a set G of objects, a set M of attributes, a set V of attribute values, and a ternary relation I between G, M and V. The many-valued context of each attribute is then transformed into formal concepts; the process of creating single-valued contexts from a many-valued data set is called conceptual scaling. Creating a conceptual scale must be performed using expert knowledge from the domain from which the data is drawn. Often these conceptual scales, along with their concept lattice, are created by hand, since they are represented by formal contexts often laid out by hand: a threshold t is chosen for each many-valued attribute, which is then replaced by two one-valued attributes based on its "expression value" [15], [16].

3 Related Work

Many researchers have combined GAs with IDSs, either for optimization or to build classifiers for the intrusion detection process.
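The intent and extent operators of equations (1) and (2) in Section 2.4 can be computed directly for a small binary context. A minimal sketch, using a made-up toy context (the object and attribute names are purely illustrative):

```python
# Toy formal context: objects G, attributes M, incidence relation I ⊆ G × M.
G = {"g1", "g2", "g3"}
M = {"m1", "m2", "m3"}
I = {("g1", "m1"), ("g1", "m2"),
     ("g2", "m1"), ("g2", "m2"), ("g2", "m3"),
     ("g3", "m3")}

def intent(A):
    # Attributes shared by every object in A (equation 1).
    return {m for m in M if all((g, m) in I for g in A)}

def extent(B):
    # Objects possessing every attribute in B (equation 2).
    return {g for g in G if all((g, m) in I for m in B)}

A = {"g1", "g2"}
B = intent(A)
# (A, B) is a formal concept exactly when extent(B) == A and intent(A) == B.
print(B, extent(B) == A)
```

Here intent({g1, g2}) = {m1, m2} and extent({m1, m2}) = {g1, g2}, so the pair forms a formal concept of this toy context.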
Dasgupta and Gonzalez have done research concerning AIS-inspired network security systems [2, 3, 8]. In [2] they built a model applying the Positive Characterization (PC) concept, which follows the NSA: a model is built representing the self space, and connections are characterized (as normal or anomalous) according to their distance to that self model. They also implemented another algorithm applying the Negative Characterization (NC) concept, which builds a model of the non-self space and uses it to detect attacks. Both algorithms used a GA with a sequential niching algorithm to generate the rules used to define the models. Real-valued variables were used instead of binary encoding, so the model represents self/non-self samples in the hyperspace and the detectors cover the complementary space. They concluded that PC gives more precise results than NC, but NC requires less time and space resources. In [3] they implemented an algorithm to build a model representing the self space for anomaly detection as well. They used a variability parameter to define levels of abnormality. Again, a GA was used to generate the detectors, but this time using the deterministic-crowding niching technique. Their new technique had better computational power and showed very good results in detecting attacks. They used the DARPA intrusion detection evaluation data set. In [8], they implemented a rule-based system (RBS) by creating artificial intelligence rules using GAs for intrusion detection. They followed NC, where detectors are generated to match anomalous connections. They used the hyperspace fitness function originally suggested by Gonzalez and Dasgupta. Wei Li and Issa Traore [17] proposed a rule evolution approach based on GA, but they used parse trees to represent the population instead of chromosomes. They used the DARPA data set for evaluation. In [18] a GA was used to generate a set of rules where each rule identifies a particular attack type.
As a result of their experiment, they generated a set of six rules that classify six different attack types falling into two classes: DoS and probe. They used the following fitness function:

F = a/A − b/B (3)

with threshold 0.95. Pillai et al. [19] also implemented a NIDS using a GA to create rules automatically for specific connections, and they used real network data for evaluation. McFadden [20] created a similar system but used a different fitness function, which depends on the degree of matching between the field values and the suspected fields, with predefined weights for each field; a penalty is then calculated based on the matching measures and the ranking. He used JGAP - an open-source Java-based GA framework - to develop the system. In [21] they focused on feature selection to reduce the number of features used in intrusion detection. They used mutual information to define the relation between a decision variable X and a connection feature variable Y. In other words, they were looking into the amount of information about the connection type contained in each connection feature. Fan Li in [22] proposed an intelligent IDS which combines both anomaly and misuse techniques. A GA is used for the fuzzy logic in the learning component of the system, to tune the fuzzy membership functions and to select an appropriate set of features. Other work involving the use of GAs for intrusion detection can be found in [23], [24], [25], and [26]. Also, [27] gives a detailed survey of such systems.

4 The Proposed Network Anomaly Detection Approach

Genetic Algorithms produce the best individual as a solution, but in an A-NIDS a set of rules is needed - hence the GA would have to run multiple times. The technique used here was originally proposed in [28], where an algorithm was implemented to generate detectors for network anomaly intrusion detection using a GA with the deterministic-crowding niching technique.
The strengths of the deterministic-crowding niching technique are that it requires no additional parameters beyond those already used in a GA, besides being fast and simple [29]. The self (normal behaviour) individuals are represented in a self space S, where each individual is represented as a vector of features of dimension n, with values normalized to the interval [0.0, 1.0]. This can be written as S = {x1, ..., xm}, where m is the number of self samples. Algorithm 1 shows the main steps of the detector generation approach. The final solution is a set of rules, represented as individuals with low and high limits for each dimension, as the conditions used to define the AIS detectors. So, each rule R has a condition part (xi ∈ [lowi, highi] for each dimension i); hence a feature vector satisfies a rule R if its hyper-sphere intercepts the hyper-cube that the rule defines by its low and high points [3]. To calculate the fitness of an individual (or a rule), two things are taken into account. The first is the number of elements in the training sample that are included in the rule's hyper-cube, calculated as [2, 3]:

num_elements(R) = |{xi : xi ∈ S and xi ∈ R}| (4)

The second is the volume of the hyper-cube that the rule represents:

volume(R) = ∏_{i=0}^{n} (highi − lowi) (5)

Consequently, the fitness is calculated using the following equation:

fitness(R) = volume(R) − C × num_elements(R) (6)

where C is a coefficient of sensitivity that represents a penalty if a rule covers anomaly samples. The bigger the

Algorithm 1: Detectors generation algorithm
  Initialize the population by selecting random individuals from the space S.
  for the specified number of generations do
    for the size of the population do
      Select two individuals (with uniform probability) as parent1 and parent2.
      Apply crossover to produce a new individual (child).
      Apply mutation to child.
      Calculate the distance between child and parent1 as d1, and the distance between child and parent2 as d2.
      Calculate the fitness of child, parent1, and parent2 as f, f1, and f2 respectively.
      if (d1 < d2) and (f > f1) then
        replace parent1 with child
      else if (d2 <= d1) and (f > f2) then
        replace parent2 with child
      end if
    end for
  end for
  Extract the best (highly-fitted) individuals as the final solution.

C value, the higher the sensitivity - hence the penalty - is. The fitness can take negative values. The same equations are used when calculating the fitness of an individual in general. To calculate the distance between two individuals (a child c and a parent p), the volumes of the hyper-cubes surrounding the individuals (represented by low and high points in each dimension) are used as follows:

distance(c, p) = (volume(p) − volume(p ∩ c)) / volume(p) (7)

The purpose of using the volume is to check how much of the parent's area the child covers, so this distance measure is not symmetric. The algorithm had very good results (compared to some others, as will be shown later in the paper), with a highest detection rate of 81.76%. The Euclidean distance was used in the comparison and detection process. In this paper, the use of another distance metric is proposed - precisely, the Minkowski distance. It is a general metric that includes other metrics as special cases of its form [30], [31]. The Minkowski distance between two vectors X and Y can be expressed as

d(X, Y) = (∑_{i=0}^{n} |xi − yi|^p)^{1/p} (8)

where p is the Minkowski metric order; it can take values from 0 to infinity (and can even be a real value between 0 and 1). If p = 1 it is the Manhattan distance, if p = 2 it is the Euclidean distance, and as p approaches infinity it becomes the maximum (or Chebyshev) distance.

5 Experimental Results and Discussion

5.1 Data Sets

The experiment was performed on the NSL-KDD data set, which was suggested to solve some problems in the KDD Cup'99 data set that is widely used for IDS evaluation.
This data set contains fewer records in both the train and test sets, which helps researchers run their experiments on the whole sets instead of only small portions. Hence, the evaluation results will be comparable and consistent [32]. Fifteen parameters (features) with real values, usable in the approach and suitable for detecting basic DoS attacks, were selected for the experiment. These feature values are already in the interval [0.0, 1.0], and they are listed in Table (1) in [28].

5.2 Experiment Settings

Only the self (normal) data was used in the training phase to generate the best rules that represent the self profile, as the negative selection approach suggests; the rules were then compared against the test sample. The parameter values used for the genetic algorithm are given in Table (1).

Variable                | Value
Population Size         | 200, 400, 600
Number of Generations   | 200, 500, 1000, 2000
Mutation Rate           | 0.1
Sensitivity Coefficient | 1.0
Variability Value       | 0.05, 0.10, 0.15, 0.20
p (Minkowski Order)     | 0.5

Table 1: Parameter values used for the genetic algorithm.

Following the NSA, the algorithm is basically trained (to generate rules) on self (normal) samples; these rules are then used to find non-self samples (anomalies), which will be the vectors far from the self rules. To characterize the samples as self or non-self, the characterization function was:

μ_non_self(x) = D(x, Self) = min{d(x, s) : s ∈ Self} (9)

which means that the closer a vector x is to a self point s, the less likely it is a non-self sample. The distance measure d(x, s) is the Minkowski distance mentioned above.

Figure 2: Euclidean Distance Average Detection Rates versus Threshold Levels.
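The distance and characterization computations of equations (8) and (9) can be sketched in a few lines. The self samples, the monitored vector, and the threshold below are made up for illustration only:

```python
def minkowski(x, y, p):
    # Minkowski distance of order p (equation 8);
    # p=1 gives Manhattan, p=2 gives Euclidean distance.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def mu_non_self(x, self_set, p=0.5):
    # Characterization function (equation 9): distance to the nearest
    # self sample; larger values indicate a more anomalous vector.
    return min(minkowski(x, s, p) for s in self_set)

# Hypothetical normalized self samples and a monitored feature vector.
self_samples = [(0.1, 0.2), (0.15, 0.25), (0.2, 0.2)]
x = (0.9, 0.8)

score = mu_non_self(x, self_samples)
# Flag x as anomalous if the score exceeds a chosen threshold, e.g. 0.2.
print(score, score > 0.2)
```

Note that for p = 0.5, as used in the experiment, the exponent 1/p = 2 is applied to the sum of square-rooted coordinate differences; for p < 1 the measure is no longer a metric (the triangle inequality fails), but it still emphasizes similarity in many coordinates.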
5.3 Experiment Results

The training phase was run using normal samples extracted from the 20% Train Set to generate intrusion detectors. Using different population sizes run for different numbers of generations, the algorithm was executed for each combination, resulting in a group of detectors. Each detector is expressed as a value for each feature, where the variability value defines the upper and lower bounds for that feature. Each set of detectors was tested against the Test Set of the NSL-KDD data set for detection of anomalies, once using the Euclidean distance and once using the Minkowski distance.

Figure 3: Minkowski Distance Average Detection Rates versus Threshold Levels.

Figures 2 and 3 show the average detection rates (with respect to variation levels) using the Euclidean and Minkowski distances respectively. It can be seen that - for all combinations - the Minkowski distance gives better detection results (mostly above 80%), with detectors generated by smaller populations giving better results. In Figures 4 and 5, the maximum detection rates obtained with the Euclidean and Minkowski distances, respectively, are shown with respect to the threshold levels. Figure 4 shows that the Euclidean distance always gives the best results with a lower threshold of 0.2, and that the detection rates drop as the threshold gets higher.

Figure 4: Euclidean Distance Maximum Detection Rates versus Threshold Levels.
Figure 5: Minkowski Distance Maximum Detection Rates versus Threshold Levels.
Figure 6: Euclidean Distance Maximum Detection Rates versus Variation Levels.
Figure 7: Minkowski Distance Maximum Detection Rates versus Variation Levels.

For the Minkowski distance, using a higher threshold gives better results with detectors generated by smaller populations.
Using a lower threshold gave better results with detectors generated by bigger populations. The maximum rates with respect to variation levels are shown in Figures 6 and 7 for the Euclidean and Minkowski distances respectively. With the Euclidean distance, detectors generated with fewer generations give better results at smaller variation levels, while higher variation levels are better for detectors generated with more generations. For the Minkowski distance, variation levels of 0.10 and 0.15 give higher detection rates with detectors generated by bigger populations, but fewer generations give better detection rates at lower variation levels (0.05 and 0.10).

5.4 Comparison Analysis

In [33], the machine learning algorithms implemented in the WEKA project [34] were run against the NSL-KDD Test Set. The detection accuracy is listed in Table (2) along with the suggested algorithm, and it shows that using the Minkowski distance gives very good detection accuracy compared to those approaches. Analysing the performance of the proposed approach for intrusion detection, evaluation measures are calculated - true/false positive and negative rates - and shown in the following figures. From Figures 8 and 9, we can see that detectors generated by the GA using bigger populations give higher True Negative Rates (TNR) (and lower False Positive Rates (FPR)) than those generated using smaller populations. Consequently, using smaller populations results in higher True Positive Rates (TPR) and lower False Negative Rates (FNR) than using bigger populations, as shown in Figures 10 and 11 respectively. Looking further into the results with respect to the variation values (which define the upper and lower limits of the detector conditions), high variation levels result in higher TNRs and lower FPRs with the detectors generated by bigger populations, as seen in Figures 12 and 13.
Figures 14 and 15 show that TPRs are higher (and FNRs are lower) with lower variation levels.

Classifier name                 | Classification accuracy
J48                             | 81.05%
Naive Bayes                     | 76.56%
NBTree                          | 82.03%
Random Forest                   | 80.67%
Random Tree                     | 81.59%
Multi-layer Perceptron          | 77.41%
SVM                             | 68.52%
Suggested Algorithm - Euclidean | 81.76%
Suggested Algorithm - Minkowski | 82.13%

Table 2: The classification accuracy of known classifiers.

Figure 8: Minkowski Distance True Negatives Rates.
Figure 9: Minkowski Distance False Positives Rates.
Figure 10: Minkowski Distance True Positives Rates.

Based on the threshold values used in the experiment - mentioned in Table (2) - low threshold values help detect more anomalous activities, especially with low variation levels. Hence, the higher the threshold becomes, the higher the TNRs and the lower the TPRs, as shown in Figures 16 to 19.

5.5 Visualization of the Correlation between Highly Effective Features

The first step applied here is the selection of the most important and effective features. A forward feature selection technique is applied on the input data set, based on the naive Bayesian tree classification technique. The input data set contains 2000 records, with a balanced distribution between normal and abnormal system behaviour. To ensure the reliability of the resulting accuracy, a 10-fold classification methodology is applied. The resulting classification accuracy after feature selection is 91.3%. The selected features, sorted according to their importance from higher to lower, are:
- dst_host_same_srv_rate,
- dst_host_same_src_port_rate,
- srv_serror_rate,
- srv_diff_host_rate,
- dst_host_srv_rerror_rate,
- dst_host_srv_diff_host_rate.
In the second step, Formal Concept Analysis is applied on the data set containing only the selected features from the previous step.
Figure 20 shows that the attributes dst_host_same_srv_rate, dst_host_same_src_port_rate, srv_diff_host_rate, and dst_host_srv_diff_host_rate are highly correlated and represent the most effective features for discriminating between normal and anomalous connections.

6 Conclusions and Future Work

In this paper, the Minkowski distance function was applied to detect anomalies and compared against using the Euclidean distance. The investigation was carried out using different values for the parameters of the genetic algorithm, to find those which give better results. The system is basically an intrusion detection system which uses detectors generated by a genetic algorithm combined with the deterministic-crowding niching technique, applied on the NSL-KDD IDS test data set under the scope of negative selection theory. The Minkowski order can be a small value (between 0 and 1) or a big value (up to infinity). Lower values of the order are preferred if one is interested in finding how similar the objects are; so, a value of 0.5 was used in the experiment. With all values used within the GA, the Minkowski distance function gave better detection rates. Threshold values give very good results in different cases - use detectors generated by bigger populations with lower threshold values, or detectors generated by smaller populations with higher threshold values. Also, medium levels of variation (0.10 and 0.15) are better used for best results.

Figure 11: Minkowski Distance False Negatives Rates.
Figure 12: Minkowski Distance Average True Negatives Rates.
So, we recommend using smaller populations to generate detectors because: (1) they take less time to run, and (2) they give fewer detectors, hence less time is spent matching anomalies against the detector set. Finally, it is a matter of balancing the IDS sensitivity (TPR) against its specificity (TNR) that guides the decision of which threshold and variability values to use for best results.

Figure 13: Minkowski Distance Average False Positives Rates.
Figure 14: Minkowski Distance Average True Positives Rates.
Figure 15: Minkowski Distance Average False Negatives Rates.
Figure 16: Minkowski Distance Maximum True Positives Rates.
Figure 17: Minkowski Distance Minimum False Negatives Rates.
Figure 18: Minkowski Distance Maximum True Negatives Rates.
Figure 19: Minkowski Distance Minimum False Positives Rates.
Figure 20: Visualized correlation between highly effective features using FCA.

References

[1] Varun Chandola (2009) Anomaly Detection for Symbolic Sequences and Time Series Data, PhD Dissertation, Computer Science Department, University of Minnesota, http://purl.umn.edu/56597.
[2] Fabio A. Gonzalez and Dipankar Dasgupta (2002) An Immunity-based Technique to Characterize Intrusions in Computer Networks, IEEE Transactions on Evolutionary Computation, Vol. 6(3), pp. 281-291.
[3] Fabio A. Gonzalez and Dipankar Dasgupta (2002) An Immunogenetic Technique to Detect Anomalies in Network Traffic, Proceedings of the Genetic and Evolutionary Computation Conference, GECCO, Morgan Kaufmann, pp. 1081-1088.
[4] R.A. Kemmerer and G. Vigna (2002) Intrusion Detection: A Brief History and Overview, IEEE Computer, Vol. 1(1), pp. 27-30.
[5] Przemyslaw Kazienko and Piotr Dorosz (2004) Intrusion Detection Systems (IDS) Part 2 - Classification; methods; techniques, web white paper, http://www.windowsecurity.com/articles/ids-part2-classification-methods-techniques.html.
[6] Tarek S. Sobh and Wael M. Mostafa (2011) A cooperative immunological approach for detecting network anomaly, Applied Soft Computing, Elsevier, Vol. 11(1), pp. 1275-1283.
[7] P. Garcia-Teodoro, J. Diaz-Verdejo, G. Macia-Fernandez, E. Vazquez (2009) Anomaly-based network intrusion detection: Techniques, systems and challenges, Computers and Security, Elsevier, Vol. 28(1-2), pp. 18-28.
[8] Wei Li (2004) Using Genetic Algorithm for Network Intrusion Detection, Proceedings of the United States Department of Energy Cyber Security Group Training Conference, Vol. 8, pp. 24-27.
[9] S.X. Wu and W. Banzhaf (2010) The use of computational intelligence in intrusion detection systems: A review, Applied Soft Computing, Vol. 10, pp. 1-35.
[10] L.N. de Castro and J. Timmis (2002) Artificial Immune Systems: A New Computational Intelligence Approach, Springer, 1st Edition, XVIII, 380 p.
[11] Dipankar Dasgupta (2006) Advances in Artificial Immune Systems, IEEE Computational Intelligence Magazine, Vol. 1(4), pp. 40-49.
[12] U. Aickelin and D. Dasgupta (2003) Artificial Immune Systems, Book Chapter, Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques, Springer, pp. 375-399.
[13] Julie Greensmith, Amanda Whitbrook, Uwe Aickelin (2010) Artificial Immune Systems, Handbook of Metaheuristics, International Series in Operations Research and Management Science, Springer US, Vol. 146, pp. 421-448.
[14] Dipankar Dasgupta, Senhua Yu, Fernando Nino (2011) Recent Advances in Artificial Immune Systems: Models and Applications, Applied Soft Computing, Vol. 11(2), pp. 1574-1587.
[15] Mehdi Kaytoue, Sébastien Duplessis, Sergei O. Kuznetsov and Amedeo Napoli (2009) Two FCA-Based Methods for Mining Gene Expression Data, Lecture Notes in Computer Science, Vol. 5548, pp. 251-266.
[16] Richard Cole, Peter Eklund, Don Walker (1998) Using Conceptual Scaling in Formal Concept Analysis for Knowledge and Data Discovery in Medical Texts, Proceedings of the Second Pacific Asian Conference on Knowledge Discovery and Data Mining, pp. 378-379.
[17] Wei Li and Issa Traore (2004) Detecting New Forms of Network Intrusion Using Genetic Programming, Computational Intelligence, Vol. 20(3), pp. 475-494.
[18] Anup Goyal, Chetan Kumar (2008) GA-NIDS: A Genetic Algorithm based Network Intrusion Detection System.
[19] M. M. Pillai, Jan H. P. Eloff, H. S. Venter (2004) An Approach to Implement a Network Intrusion Detection System Using Genetic Algorithms, SAICSIT '04: Proceedings of the 2004 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists on IT Research in Developing Countries, pp. 221-221.
[20] McFadden (2008) Genetic Algorithms and Network Intrusion Detection, MBI 640 Data Communications & Network Security, Northern Kentucky University.
[21] Hossein M. Shirazi and Kalaji Y. (2010) An Intelligent Intrusion Detection System Using Genetic Algorithms and Feature Selection, Majlesi Journal of Electrical Engineering, Vol. 4, No. 1, pp. 33-37.
[22] Fan Li (2010) Hybrid Neural Network Intrusion Detection System Using Genetic Algorithm, Multimedia Technology (ICMT), International Conference, pp. 1-4.
[23] Sadiq Ali M. Khan (2011) Rule-based Network Intrusion Detection using Genetic Algorithm, International Journal of Computer Applications, Vol. 18, No. 8, pp. 26-29, Published by Foundation of Computer Science.
[24] Mohammad Sazzadul Hoque, Md. Abdul Mukit, Md. Abu Naser Bikas (2012) An Implementation of Intrusion Detection System using Genetic Algorithm, International Journal of Network Security & Its Applications, Vol. 4, No. 2, pp. 109-120.
[25] Alabsi, F. and Naoum, R. (2012) Fitness Function for Genetic Algorithm used in Intrusion Detection System, International Journal of Applied Science and Technology, Vol. 2, No. 4, pp. 129-134.
[26] Kshirsagar, Vivek K., Sonali M. Tidke and Swati Vishnu (2012) Intrusion Detection System using Genetic Algorithm and Data Mining: An Overview, International Journal of Computer Science and Informatics, ISSN (PRINT): 2231-5292.
[27] Owais, Suhail, Vaclav Snasel, Pavel Kromer and Ajith Abraham (2008) Survey: Using Genetic Algorithm Approach in Intrusion Detection Systems Techniques, IEEE Computer Information Systems and Industrial Management Applications, CISIM'08, 7th, pp. 300-307.
[28] Amira Sayed A. Aziz, Mostafa Salama, Aboul Ella Hassanien, Sanaa EL-Ola Hanafi (2012) Detectors Generation using Genetic Algorithm for a Negative Selection Inspired Anomaly Network Intrusion Detection System, Proceedings of IEEE FedCSIS, Wroclaw, Poland, pp. 625-631, ISBN 978-83-60810-51-4.
[29] Ole Mengshoel and David E. Goldberg (2008) The Crowding Approach to Niching in Genetic Algorithms, Evolutionary Computation, MIT Press, Cambridge, Vol. 16(3), pp. 315-354.
[30] Jan Schultz (2008) Minkowski Distance, http://en.wikipedia.org/wiki/Minkowski_distance.
[31] John P. Van de Geer (2003) Some Aspects of Minkowski Distances, Department of Data Theory, Leiden University, RR-95-03.
[32] NSL-KDD data set, http://nsl.cs.unb.ca/NSL-KDD/.
[33] M. Tavallaee, E. Bagheri, W. Lu, A. A.
Ghorbani (2009) A Detailed Analysis of the KDD CUP 99 Data Set, IEEE Symposium on Computational Intelligence for Security and Defense Applications, CISDA, pp. 1-6.
[34] WEKA "Waikato Environment for Knowledge Analysis (WEKA) version 3.5.9", available at: http://www.cs.waikato.ac.nz/ml/weka/, June 2008.

Usage of Holt-Winters Model and Multilayer Perceptron in Network Traffic Modelling and Anomaly Detection

Maciej Szmit
Orange Labs Poland, 7 Obrzežna Street, 02-691 Warsaw, Poland
E-mail: maciej.szmit@gmail.com, http://maciej.szmit.info

Anna Szmit
Technical University of Lodz, Department of Management, 266 Piotrkowska Street, 90-924 Lodz, Poland
E-mail: agorecka@p.lodz.pl, http://anna.szmit.info

Slawomir Adamus
Technical University of Lodz, Computer Engineering Department, 18/22 Stefanowskiego Street, 90-924 Lodz, Poland
AMG.lab, 11 Lakowa Street, 90-562 Lodz, Poland
E-mail: slawomir.adamus@hotmail.com

Sebastian Bugala
Technical University of Lodz, Computer Engineering Department, 18/22 Stefanowskiego Street, 90-924 Lodz, Poland
E-mail: sebastian.bugala@hotmail.com

Keywords: network behavioral anomaly detection, Holt-Winters model, multilayer perceptron

Received: September 16, 2012

This paper presents results of the analysis of a few kinds of network traffic using Holt-Winters methods and the Multilayer Perceptron. It also presents Anomaly Detection, a Snort-based network traffic monitoring tool which implements a few models of traffic prediction.

Povzetek: Predstavljena je metoda za modeliranje in iskanje anomalij v omrežju. [In Slovenian: A method for modelling and detecting network anomalies is presented.]

1 Introduction

In modern computer networks and highly loaded business or industrial systems there is a need for continuous availability of services and hosts (see e.g. [28], [29], [30], [34]). Inaccessibility of mission-critical services can have a large impact on business processing continuity and, as a result, generate losses. A solution to such potential problems is permanent, uninterrupted supervision of network health.
This in turn can be achieved by implementing a monitoring solution. An efficient monitoring method helps achieve high service availability, and it is a good idea to extend network security with tools such as Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS) and Unified Threat Managers (see e.g. [32] [33]). An IDS is a tool which monitors and analyses in real time every aspect of inbound and outbound traffic of the network and, based on this analysis and on one of the mechanisms responsible for threat detection, creates reports of abnormalities in network traffic. The most common threat-detection mechanisms used in IDS are misuse detection and anomaly detection; they are two different approaches to threat detection. The first relies on patterns of known abnormal parameters and network traffic behaviour, so everything that does not match a known pattern is treated as normal; the second is the reverse of the first, treating everything that deviates from the normal baseline as a potential threat. An IDS on its own only reports and logs the abnormalities and does not take any further action; its role is to report to the administrator, who decides what action should be taken to prevent imminent danger. This can be cumbersome for the administrator when the number of notifications is large. In order to reduce the administrator's workload, the IDS idea has been extended with the possibility of taking predefined actions immediately upon detection of typical, schematic threats to the network. The result is the IPS, a variety of IDS which cooperates with tools such as firewalls and controls their settings in order to counter the threat. A typical representative of the above-described tools is Snort (see e.g.
[2] [3] [31]), an IDS/IPS based on an attack-signature detection mechanism, originally intended only for the Unix platform but now also ported to the Windows operating system, and developed under open source software licences. Its large capacity and performance are the characteristics that gained Snort popularity among users. Its modular design makes the software very flexible: it can easily be adapted to the requirements of the currently analysed network environment, and its functionality can be extended. This article extends the demonstration of the capabilities of the AnomalyDetection tool (a basic overview of the tool was published in [15] and [36]), created for network monitoring and future network traffic forecasting in Snort-based applications, exploiting the flexibility and easy extensibility (the ability to create one's own preprocessors and postprocessors) of this program. The preprocessor was developed to extend Snort's possibilities of network traffic analysis with an anomaly detection mechanism [4]. The combination of the two mechanisms (i.e., misuse detection and anomaly detection) provides more comprehensive protection against all types of threats, even partially abstract ones, such as the malice of employees. The tools included in Anomaly Detection 3.0 allow analysis of traffic, its forecasting with the help of advanced statistical algorithms, evaluation of the created forecasts, and real-time monitoring which verifies that the individual network traffic parameters do not exceed the forecasted values and, in case the norms are exceeded, generates appropriate messages for the administrator, who should check each alarm for potential threats. The current (3.0) version (see e.g.
[5], [6]) of AnomalyDetection provides monitoring of the following network traffic parameters: total number of TCP, UDP, and ICMP packets; number of outgoing TCP, UDP, and ICMP packets; number of incoming TCP, UDP, and ICMP packets; number of TCP, UDP, and ICMP packets from the current subnet; number of TCP packets with SYN/ACK flags; number of outgoing and incoming WWW packets (TCP on port 80); number of outgoing and incoming DNS packets (UDP on port 53); number of ARP-request and ARP-reply packets; number of non-TCP/IP-stack packets; total number of packets; and TCP, WWW, UDP, and DNS upload and download speed [kBps]. The whole Anomaly Detection application consists of three parts: the Snort preprocessor, the Profile Generator and the Profile Evaluator. Data exchange between these parts is realized by CSV (Comma Separated Values) files (see Figure 1).

Figure 1: Anomaly Detection data flow diagram. Source: [15]. Gray solid arrows mean saving to a file and black dotted ones reading from a file.

The particular files stand for:
• Log file - this file gathers all network traffic data collected by the AD Snort preprocessor. Data from this file is then used by the Profile Generator for network traffic forecasting.
• Profile file - this file stores the network profile computed by the Profile Generator. It is generated by the Profile Generator and used by the AD preprocessor for detecting anomalies and generating alerts. After every passed time period the preprocessor reads the profile file and looks for the data corresponding to the current period. If the value of some counter exceeds the minimum (MIN) to maximum (MAX) range, an alert is generated.
• Predicted pattern file - this file contains predicted future data for the network; in fact it is the same file as the profile file, but with a single value for each counter. It is necessary for evaluating a profile with the AD Evaluator script. The structure of the pattern file is the same as that of the log file.
• Pattern file - this file is created like the predicted pattern file, but the network traffic profile stored in it is historical data.
• Parameters file - this file stores the profile generation method and the values of its parameters. It has a different structure for every profile generation algorithm.

The structures of the log and profile files can be found in [15]. Anomaly Detection has two main modes:
• data acquisition mode - only network traffic statistics are saved into the log file. Only the log file is created in this mode.
• alerting mode - in addition to data acquisition, a profile file is created and the current traffic statistics are compared to the values stored in it. In this mode the log and profile files are required.

The pattern, predicted pattern and parameters files are always optional; they are useful for future research. Anomaly Detection 3.0 can be downloaded from http://anomalydetection.info [24]. The preprocessor is available as source or as an RPM package. Both the Profile Generator and the Evaluator are available as R scripts; additional (free) R CRAN software is required to use them. Installation, update and removal scripts are provided for the Profile Generator and Evaluator.

2 Preprocessor

The main part of the Anomaly Detection system is a preprocessor written in the C programming language, designed to enhance Snort's possibilities to monitor, analyze and detect network traffic anomalies using the NBAD (Network Behavioral Anomaly Detection) approach. The first version of the AnomalyDetection preprocessor [6], for Snort version 2.4x, was published in a Master's Thesis [25] in 2006. The project has since been developed (see e.g. [5] [7] [8] [9] [17]) up to the current version 3.0, designed for Snort 2.9.x. The main task of the preprocessor is anomaly detection, realized by a simple algorithm based on data acquisition and subsequent comparison of the collected values with a pattern.
The preprocessor reads the predicted pattern of the network traffic (of all parameters) from the 'profile' file and generates an alert when a current value exceeds the 'minimum' to 'maximum' range for the current moment (the moment is given by the day of the week, hour, minute and second corresponding to the intervals from the log file). The profile can be generated 'manually', using external tools, or by the Profile Generator using an appropriate model based on historic values from the log file. The architecture affords easy implementation of different statistical models of the traffic and usage of different tools (i.e. statistical packages) for building profiles. Data from the profile is read in intervals defined by the user; only one line is read into the structure at a time, which makes it possible to alter the profile file dynamically. In case of failure to find the correct entry in the profile, the anomaly report module is automatically disabled to prevent the generation of false positive alerts. As mentioned above, the current version of the preprocessor can work with adaptive network models thanks to changes in the algorithm which loads the profile information: the single load of the whole network profile was abandoned in favour of loading a single line at a specified time interval. Profile data is loaded exactly when the counters are written to the log file. Although this solution increases the number of I/O operations, adversely affecting performance, it also supports replacing the model during runtime without having to restart the whole application. In addition, all the calculations have been relegated to third-party applications and the profile has been changed so that it contains the minimum and maximum values. This approach makes the preprocessor more flexible and efficient, and it does not limit the user to a single method of generating a network profile: the profile can be freely generated by any application, as long as the appropriate input format is maintained.
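The range check at the heart of this algorithm is simple: for the current moment, each counter must fall inside the [MIN, MAX] interval read from the profile. A minimal illustrative sketch in Python (the real preprocessor is written in C, and the dictionary layout used here is a hypothetical simplification of the profile file format):

```python
# Illustrative sketch (not the actual C preprocessor) of the profile check
# described above: each counter's current value must lie inside the
# [MIN, MAX] range read from the profile line for the current moment.
# The data layout here is a made-up simplification of the CSV format.

def check_counters(profile_row, current_values):
    """Return the names of counters whose current value falls outside
    the profile's [min, max] range for this moment."""
    alerts = []
    for name, value in current_values.items():
        lo, hi = profile_row[name]
        if not (lo <= value <= hi):
            alerts.append(name)
    return alerts

# Hypothetical profile entry for one moment and the measured counters.
profile_row = {"tcp_total": (100.0, 500.0), "udp_total": (10.0, 80.0)}
current = {"tcp_total": 620.0, "udp_total": 45.0}
print(check_counters(profile_row, current))  # ['tcp_total']
```

As in the preprocessor, a missing profile entry would simply disable the check for that moment rather than raise an alert.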
Anomaly reporting was adjusted to Snort standards by implementing a mechanism which reports events and handles them with dedicated preprocessor rules. The user can freely adjust the rules to fit his needs, for example the content of the messages stored in the log, their priority, or the action to be taken when a rule matches. These changes make the application more customizable and user-friendly. Improvements to the packet acquisition algorithm, achieved by removing unnecessary comparisons and optimizing others, together with the increased capacity of the counters, made it possible to use the preprocessor in networks with high bandwidth (1 Gb and above).

3 Profile Generator

In previous versions of the AnomalyDetection system the profile generation module was included in the preprocessor module, which made the whole application inflexible. The current version of the Profile Generator (see e.g.
[7] [8] [9]) has been separated into an independent module which can be used to compute statistical models not only for the AD preprocessor. Furthermore, the current version is based on the R language/environment (The R Project for Statistical Computing) (see e.g. [10] [11] [12] [13] [14]), which is more flexible and user-friendly than the previous implementation in C. R is a free, open source package for statistical computing and graphics. In this implementation the optional R packages tseries, quadprog, zoo and getopt are used. The implementation of the Profile Generator is divided into a few parts. The first part prepares the data from the log file for further calculations; the other parts, depending on the given parameters, calculate future network traffic forecasts. At the end all computed values are written into the proper files, based on the given runtime parameters. The data flow in the Profile Generator module is shown in Figure 2.

Figure 2: Profile Generator data flow diagram. Source: [35].

The Profile Generator is controlled by parameters passed at script execution; all script parameters are handled with the getopt() function. The particular columns of its specification matrix contain, respectively:
• long flag name
• short flag
• parameter's arguments
• argument type
• description

The Profile Generator currently implements five methods of profile file generation: moving average, naive method, autoregressive time series model, Holt-Winters model and Brutlag's version of the HW model (see e.g. [1] [17]).
The value of the dependent variable is given as follows.

Moving average:

ŷ_t = (1/k) Σ_{i=t-k}^{t-1} y_i   (1)

Naive method:

ŷ_t = y_{t-T}, where T is the day or week period,   (2)

or

ŷ_t = y_{t-1}   (3)

Autoregressive time series model:

ŷ_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + ... + a_k y_{t-k}   (4)

Holt-Winters model:

ŷ_t = L_{t-1} + P_{t-1} + S_{t-T}   (5)

where L is the level component, given by

L_t = α(y_t − S_{t-T}) + (1 − α)(L_{t-1} + P_{t-1}),   (6)

P is the trend component, given by

P_t = β(L_t − L_{t-1}) + (1 − β)P_{t-1},   (7)

and S is the seasonal component, given by

S_t = γ(y_t − L_t) + (1 − γ)S_{t-T}.   (8)

Brutlag method:

ŷ_t^max = L_{t-1} + P_{t-1} + S_{t-T} + m·d_{t-T}   (9)

ŷ_t^min = L_{t-1} + P_{t-1} + S_{t-T} − m·d_{t-T}   (10)

where L, P and S are the same as in the Holt-Winters model and d is the predicted deviation, given by

d_t = γ|y_t − ŷ_t| + (1 − γ)d_{t-T}   (11)

In the above: k is the number of measurements in the time series, t is the moment in time, ŷ_t is the predicted value of the variable in moment t, y_t is the real (measured) value of the variable in moment t, T is the time series period, α is the data smoothing factor, β is the trend smoothing factor, γ is the seasonal change smoothing factor, and m is the scaling factor for Brutlag's confidence bands.

4 Implementation of the Naive Method

The naive method is the simplest method implemented in the Profile Generator module. To compute a profile with this method, PG must be launched with the '-m NAIVE' parameter. The additional '--naive' parameter can be used to define the method's periodicity in detail. The method implements three versions of naive prediction: LAST, DAILY and WEEKLY. In the LAST version the forecast is defined as equal to the previous measurement. The DAILY version means that the predicted values for some day are the same as the values on the previous day of the given time series. In the last version the forecasted values are determined from the logged data for the same day of week in the previous week. Because of the simplicity of this method it should be used only in adaptive startup mode; this causes fewer false-positive alerts and more dynamic prediction.
In this mode the profile is recalculated at regular intervals of time, so the predicted values are refreshed with every oncoming period of counter value registration. Figure 3 shows a graph of the predicted values with a 5-period interval of method recalculation. Step changes of the predicted values can be observed in the succeeding periods. The Y-axis on Figure 3 stands for the minimal and maximal borders of permitted values for the total number of TCP packets; the X-axis stands for the sample number in the forecasted time series.

Figure 3: Naive method running in adaptive mode with a 5-period interval of recalculation. Source: [35].

5 Implementation of the Moving Average Method

The moving average method is computed when the Profile Generator is run with the '-m AVG' parameter set. The method's periodicity and the length of the horizon of values used for calculation can be defined with the '--avg' parameter. Similarly to the naive method, there are three versions of periodicity: LAST, DAILY and WEEKLY. A second parameter is also required, which stands for the number of values used to compute the moving average. For example 'DAILY,3' means that values from the three previous days will be used to compute the prediction, and 'LAST,5' means that the average will be computed using the five previous values registered in the log file.

6 Implementation of the Autoregressive Model

The AR model is calculated when PG is run with the '-m AR' parameter. Calculations in this method are based on the ar() function from the stats package in the R environment. The ar() function fits an autoregressive time series model to the given data and is a wrapper for the functions ar.yw, ar.burg, ar.ols and ar.mle. The 'method' parameter of the ar() function defines the method used to fit the model. Four algorithms are available to fit the model to the given time series: Yule-Walker, Burg, MLE (maximum likelihood) and OLS (ordinary least squares).
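The naive and moving-average predictions described above reduce to simple indexing and averaging over the logged series. A minimal illustrative sketch (the tool itself implements these as R scripts):

```python
# Illustrative sketch of the LAST/periodic naive method and the moving
# average described above; the actual tool implements these in R.

def naive_forecast(series, period=1):
    """Naive prediction: repeat the value one period back.
    period=1 is the LAST variant; a DAILY or WEEKLY variant would use
    the number of samples per day or per week as the period."""
    return series[-period]

def moving_average_forecast(series, k):
    """Moving average of the k most recent values."""
    window = series[-k:]
    return sum(window) / len(window)

log = [120.0, 130.0, 125.0, 140.0, 135.0]
print(naive_forecast(log))               # 135.0 (LAST)
print(moving_average_forecast(log, 3))   # (125 + 140 + 135) / 3
```

This mirrors the 'LAST,5' and 'DAILY,3' style parameters: the periodicity chooses which past sample (or samples) the forecast is built from.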
7 Implementation of the Holt-Winters Model

The Holt-Winters model, also called the triple exponential smoothing model, is a well-known adaptive model used for modelling time series characterized by trend and seasonality (see e.g. [20], [19] p. 248, [18], [21], [22]). The model is sometimes used for modelling and prediction of network traffic (see e.g. [23], [7], [8]). To compute a Holt-Winters model, the Profile Generator must be launched with the parameter '-m HW'. The optional parameter '--hw' can be set to define the model periodicity and the subset of data used to build the model. The implementation of the Holt-Winters prediction method in the Profile Generator is based on the function HoltWinters() from the stats package. The HoltWinters() function requires time series data as an object of class 'ts' (time-series object). The 'ts' object is created as follows:

ts.obj <- ts(log.data[,column.log],
             frequency=profile.config.frequency,
             start=c(as.numeric(log.first.date),
                     log.first.sample.no))

The 'ts' function gets three parameters in this implementation:
• data - a numeric vector of the observed time-series values
• frequency - the number of observations per unit of time
• start - the time of the first observation. This parameter can be a single number or a vector of two integers; in our implementation the human-readable date from the log file is therefore converted into a numeric value, and the second value is the number of the first observed sample within the day.

Next, the HoltWinters() function computes Holt-Winters filtering of the given time series. The function tries to find the optimal values of α, β and γ by minimizing the squared one-step prediction error with the optim() function. Start values for L, P and S are inferred by performing a simple decomposition into trend and seasonal components using moving averages; this is realized with the decompose() function. Figure 4 shows one weekly period (from January 1st to January 7th) of the testing data.

Figure 4: One period of testing data. Source: own research.
The decompose() function decomposes a time series into seasonal, trend and irregular components using moving averages. For the testing data the decompose() function returns the trend, seasonal and random components. Figure 5 shows the decomposed data.

Figure 5: Decomposed time series. Source: own research.

The HoltWinters() function estimates the HW model smoothing parameters (alpha, beta and gamma), which for the testing data were as follows (see Figure 6).

Figure 6: Fitted Holt-Winters. Alpha=0.8140128; beta=0; gamma=1. Source: own research.

Figure 7 shows the Holt-Winters fit compared to the observed values for the given testing data: the black line stands for the observed data and the gray line for the fitted model (over most of the range the black line covers the gray one).

Figure 7: Holt-Winters fitted to observed comparison. Source: own research.

When the Holt-Winters model has been computed, the future prediction can be calculated simply with the predict.HoltWinters() function. The predict() function takes two arguments in this case:
• a HoltWinters object with the fitted model parameters
• the number of future periods to predict
The function returns a time series of the predicted values for the given future periods. For the testing data the values returned from the predict() function are shown in Figure 8.

Figure 8: Holt-Winters prediction. Source: own research.

8 Brutlag's Algorithm

The Holt-Winters method was used to detect network traffic anomalies as described in the article [1]. In that paper, the concept of "confidence bands" was introduced. As described in the article, confidence bands measure the deviation for each time point in the seasonal cycle, and the mechanism is based on the expected seasonal variability. Figure 9 shows computed confidence bands for a HW time series prediction.
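The level/trend/seasonal updates behind this fitting procedure, together with Brutlag's deviation-scaled band, are short enough to sketch directly. This is an illustrative Python rendering under assumed smoothing factors and initial components, not the stats-package implementation the tool actually calls:

```python
# Illustrative sketch of additive Holt-Winters updates and Brutlag's
# confidence band, assuming given smoothing factors and initial
# components; the tool itself relies on R's HoltWinters().

def hw_brutlag_step(y, L, P, S, d, t, T, alpha, beta, gamma, m):
    """One update step at time t. S and d are per-season lists of
    length T; returns (L_new, P_new, forecast, lower, upper)."""
    s_prev = S[t % T]
    forecast = L + P + s_prev                      # one-step HW forecast
    lower = forecast - m * d[t % T]                # Brutlag band
    upper = forecast + m * d[t % T]
    dev = gamma * abs(y - forecast) + (1 - gamma) * d[t % T]
    L_new = alpha * (y - s_prev) + (1 - alpha) * (L + P)
    P_new = beta * (L_new - L) + (1 - beta) * P
    S[t % T] = gamma * (y - L_new) + (1 - gamma) * s_prev
    d[t % T] = dev
    return L_new, P_new, forecast, lower, upper

# Toy series with period T=4 and hypothetical parameters.
T, alpha, beta, gamma, m = 4, 0.5, 0.1, 0.3, 2.0
L, P = 10.0, 0.0
S = [0.0, 2.0, -1.0, -1.0]
d = [1.0] * T
for t, y in enumerate([10.5, 12.2, 9.1, 9.4, 10.9]):
    L, P, f, lo, hi = hw_brutlag_step(y, L, P, S, d, t, T,
                                      alpha, beta, gamma, m)
```

A measured value falling outside [lower, upper] would be the anomaly signal; the scaling factor m plays the role of the '--scale' parameter discussed below.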
Figure 9: Brutlag's confidence bands. Source: own research.

The confidence band is computed by comparing the last period of collected network traffic values with the fitted Holt-Winters values for the same period. The difference between the real and predicted values is then scaled with the γ estimated by the Holt-Winters function; the obtained value is finally multiplied by the scaling factor. The confidence band width is controlled with the '--scale' parameter; the above example is computed with a scale parameter value of 2. Brutlag proposes that sensible values of the '--scale' parameter are between 2 and 3. The particular lines stand for:
• black - observed values of the time series
• medium gray - computed prediction of the time series with the Holt-Winters model
• light gray - upper bound of Brutlag's confidence band
• gray - lower bound of Brutlag's confidence band

9 Usage of the Profile Generator

The generator can be launched like any script in the CLI (Command Line Interface) of an operating system with the R software and the necessary packages installed. The scripts available at [24] were tested on a few GNU/Linux distributions: Fedora, Oracle Linux, CentOS, Debian, and Ubuntu. The parameters of the Profile Generator script (ad_profilegenerator.r) are validated against a BNF grammar; in summary, the script accepts:
• -(-log|l) <log file path>
• -(-profile|p) <profile file path>
• -(-evaluator|e) <evaluator file path>
• -(-pattern|P) <pattern file path>
• -(-save|s) <output file path>
• -(-method|m) <AVG | NAIVE | AR | HW | BRUTLAG>
• method parameters: --avg and --naive with (DAILY|WEEKLY) periodicity, --hw and --brutlag with (DAILY|WEEKLY) periodicity, and a fitting algorithm from (YW|BURG|MLE|OLE) for the AR model
• -(-ahead|a) <WEEK | MONTH>
• -(-scale|d) <number>
• -(-verbose|v)

The impact of each parameter is clarified under the '--help' parameter. At least one of the output parameters should be set for running the script to make sense.
For example, the simplest naive prediction for real data stored in the 'log.csv' file, with the profile data saved to the 'profile.csv' file, can be launched with:

./ad_profilegenerator.r -l log.csv -p profile.csv -m NAIVE --naive LAST

A one-week-ahead prediction for the same file, based on the Holt-Winters algorithm with daily periodicity and with 'verbose' mode, can be calculated with:

./ad_profilegenerator.r -l log.csv -p profile.csv -m HW --hw DAILY --ahead WEEK -v

10 Evaluator

The Profile Evaluator is the third part of the Anomaly Detection project. The script is designed for fast evaluation of a profile file against a log file. It calculates two simple statistics, MAE and M, for the two files. The main application of the Evaluator is to check the fit between a pattern and the currently logged values (with the log and pattern files) or between a model and historical data (the log and predicted pattern files). MAE stands for Mean Absolute Error and M for Mean:

MAE = (1/n) Σ_{t=1}^{n} |y_t − ŷ_t| = (1/n) Σ_{t=1}^{n} |e_t|   (12)

M = (1/n) Σ_{t=1}^{n} y_t   (13)

where y_t is the real (current) value of a counter in moment t, ŷ_t is the predicted (estimated) value of the counter in moment t, and e_t is the prediction error in moment t. The calculated values for each counter can be stored in an output file when the '-s' parameter is set. An exemplary comparison of the real registered values with their prediction is shown in Figure 10.

Figure 10: Real values compared to the AVG - DAILY,3 prediction. Source: [35].

The Profile Evaluator script is launched in the same way as the Profile Generator script. In summary, the Evaluator script (ad_evaluator.r) accepts:
• -(-help|h)
• -(-log|l) <log file path>
• -(-pattern|p) <pattern file path>
• -(-save|s) <output file path>
• -(-skip|S)
• -(-verbose|v)

Evaluation of the pattern stored in the 'pattern.csv' file against the log data stored in the 'log.csv' file can be done with:

./ad_evaluator.r -l log.csv -p pattern.csv --verbose

11 Multilayer Perceptron

All our previous models can be classified as statistical models assigned to one of two groups: time series models and descriptive models.
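The two evaluation statistics computed by the Evaluator are straightforward; a minimal illustrative sketch (the tool computes them in R over CSV columns):

```python
# Illustrative sketch of the Evaluator's two statistics: the mean
# absolute error between logged and predicted counter values, and the
# plain mean. The tool itself computes these in R from CSV files.

def mae(real, predicted):
    """Mean Absolute Error between two equal-length series."""
    return sum(abs(y - yp) for y, yp in zip(real, predicted)) / len(real)

def mean(values):
    """Plain mean of a series."""
    return sum(values) / len(values)

real = [100.0, 110.0, 95.0, 120.0]
predicted = [98.0, 115.0, 90.0, 118.0]
print(mae(real, predicted))  # (2 + 5 + 5 + 2) / 4 = 3.5
print(mean(real))            # 106.25
```

Comparing MAE against M gives a quick sense of the relative size of the prediction error for each counter.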
The next step is the usage of artificial-intelligence methods, particularly Artificial Neural Networks (ANN), which in the current state of our research are implemented only as offline models. Artificial Neural Networks are mathematical models inspired by biological neural networks. An ANN consists of an interconnected group of artificial neurons operating in parallel. The ANN's function is determined by the weights of the connections between neurons, which usually change during a learning phase. There are many types and architectures of ANN, according to their purpose. Because of the nature of IDS there are two main groups of issues: pattern recognition (especially classification) and prediction. These issues correspond to two main areas of application of ANN. Consequently, an ANN can be used for intrusion detection in two main ways: as a classifier which determines whether a given object (for example a network packet, an e-mail, a network flow) is normal or suspicious, and as a predictor which tries to forecast future values of system parameters (for example network traffic, CPU utilization, the number of network connections). There are many publications about the use of different types of ANN for network traffic prediction (see e.g. [22], [23], [24], [25]) or intruder detection (see e.g. [19], [20], [21]). In our current research we chose the simplest artificial neural network, the Multilayer Perceptron (MLP), for prediction of traffic time series values. An MLP is a network of neurons called perceptrons. The perceptron is a binary classifier which computes a single output from multiple inputs (and the 'bias', a constant term that does not depend on any input value) as a function of their weighted sum. y =
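The perceptron computation just described (a weighted sum of the inputs plus a bias, passed through an activation function) can be sketched as follows; the step activation and the weights below are illustrative assumptions, not values from the paper:

```python
# Illustrative perceptron: output = f(sum_i w_i * x_i + bias), here with
# a simple step activation. Weights and inputs are made-up examples.

def perceptron(inputs, weights, bias):
    """Binary classifier: weighted sum of inputs plus bias, thresholded."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else 0

weights = [0.4, -0.6, 0.2]
bias = -0.1
print(perceptron([1.0, 0.5, 1.0], weights, bias))  # 1 (sum is positive)
print(perceptron([0.0, 1.0, 0.0], weights, bias))  # 0 (sum is negative)
```

An MLP stacks layers of such units (with a smooth activation instead of the hard threshold) so that the network can be trained to predict continuous traffic values rather than binary labels.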