Zbornik 17. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2014 Zvezek A Proceedings of the 17th International Multiconference INFORMATION SOCIETY – IS 2014 Volume A Inteligentni sistemi Intelligent Systems Uredila / Edited by Rok Piltaver, Matjaž Gams http://is.ijs.si 7.−8. oktober 2014 / October 7th−8th, 2014 Ljubljana, Slovenia Zbornik 17. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2014 Zvezek A Proceedings of the 17th International Multiconference INFORMATION SOCIETY – IS 2014 Volume A Inteligentni sistemi Intelligent Systems Uredila / Edited by Rok Piltaver, Matjaž Gams http://is.ijs.si 7. - 8. oktober 2014 / October 7th - 8th, 2014 Ljubljana, Slovenia Urednika: Rok Piltaver Odsek za inteligentne sisteme Institut »Jožef Stefan«, Ljubljana Matjaž Gams Odsek za inteligentne sisteme Institut »Jožef Stefan«, Ljubljana Založnik: Institut »Jožef Stefan«, Ljubljana Priprava zbornika: Mitja Lasič, Vesna Lasič, Lana Zemljak Oblikovanje naslovnice: Vesna Lasič, Mitja Lasič Ljubljana, oktober 2014 CIP - Kataložni zapis o publikaciji Narodna in univerzitetna knjižnica, Ljubljana 004.89(082)(0.034.2) MEDNARODNA multikonferenca Informacijska družba (17 ; 2014 ; Ljubljana) Inteligentni sistemi [Elektronski vir] : zbornik 17. mednarodne multikonference - IS 2014, 7-8 oktober 2014, Ljubljana, Slovenija : zvezek A = Intelligent systems : proceedings of the 17th International Multiconference Information Society - IS 2014, October 7th-8th, 2014, Ljubljana, Slovenia : volume A / uredila/edited by Rok Piltaver, Matjaž Gams. - El. knjiga. - Ljubljana : Institut Jožef Stefan, 2014 Način dostopa (URL): http://library.ijs.si/Stacks/Proceedings/InformationSociety ISBN 978-961-264-071-2 (pdf) 1. Gl. stv. nasl. 2. Vzp. stv. nasl. 3. Dodat. nasl. 4. Piltaver, Rok 27986727 PREDGOVOR MULTIKONFERENCI INFORMACIJSKA DRUŽBA 2014 Multikonferenca Informacijska družba (http://is.ijs.si) s sedemnajsto zaporedno prireditvijo postaja tradicionalna kvalitetna srednjeevropska konferenca na področju informacijske družbe, računalništva in informatike. Informacijska družba, znanje in umetna inteligenca se razvijajo čedalje hitreje. Čedalje več pokazateljev kaže, da prehajamo v naslednje civilizacijsko obdobje. Npr. v nekaterih državah je dovoljena samostojna vožnja inteligentnih avtomobilov, na trgu pa je moč dobiti kar nekaj pogosto prodajanih tipov avtomobilov z avtonomnimi funkcijami kot »lane assist«. Hkrati pa so konflikti sodobne družbe čedalje bolj nerazumljivi. Letos smo v multikonferenco povezali dvanajst odličnih neodvisnih konferenc in delavnic. Predstavljenih bo okoli 200 referatov, prireditev bodo spremljale okrogle mize, razprave ter posebni dogodki kot svečana podelitev nagrad. Referati so objavljeni v zbornikih multikonference, izbrani prispevki bodo izšli tudi v posebnih številkah dveh znanstvenih revij, od katerih je ena Informatica, ki se ponaša s 37-letno tradicijo odlične evropske znanstvene revije. Multikonferenco Informacijska družba 2014 sestavljajo naslednje samostojne konference: • Inteligentni sistemi • Izkopavanje znanja in podatkovna skladišča • Sodelovanje, programska oprema in storitve v informacijski družbi • Soočanje z demografskimi izzivi • Vzgoja in izobraževanje v informacijski družbi • Kognitivna znanost • Robotika • Jezikovne tehnologije • Interakcija človek-računalnik v informacijski družbi • Prva študentska konferenca s področja računalništva • Okolijska ergonomija in fiziologija • Delavnica Chiron. 
Soorganizatorji in podporniki konference so različne raziskovalne in pedagoške institucije in združenja, med njimi tudi ACM Slovenija, SLAIS in IAS. V imenu organizatorjev konference se želimo posebej zahvaliti udeležencem za njihove dragocene prispevke in priložnost, da z nami delijo svoje izkušnje o informacijski družbi. Zahvaljujemo se tudi recenzentom za njihovo pomoč pri recenziranju.
V 2014 bomo drugič podelili nagrado za življenjske dosežke v čast Donalda Michija in Alana Turinga. Nagrado Michie-Turing za izjemen življenjski prispevek k razvoju in promociji informacijske družbe je prejel prof. dr. Janez Grad. Priznanje za dosežek leta je pripadlo dr. Janezu Demšarju. V letu 2014 četrtič podeljujemo nagrado »informacijska limona« in »informacijska jagoda« za najbolj (ne)uspešne poteze v zvezi z informacijsko družbo. Limono je dobila nerodna izvedba piškotkov, jagodo pa Google Street View, ker je končno posnel Slovenijo. Čestitke nagrajencem!
Niko Zimic, predsednik programskega odbora
Matjaž Gams, predsednik organizacijskega odbora
FOREWORD - INFORMATION SOCIETY 2014
The Information Society Multiconference (http://is.ijs.si) has become one of the traditional leading conferences in Central Europe devoted to the information society. In its 17th year, it delivers a broad range of topics in an open academic environment that fosters new ideas, which makes our event unique among similar conferences, promoting key visions in interactive, innovative ways. As knowledge progresses ever faster, it seems that we are indeed approaching a new civilisation era. For example, several countries allow autonomous car driving, and several car models enable autonomous functions such as “lane assist”. At the same time, however, it is hard to understand the growing conflicts in human civilisation.
The multiconference runs in parallel sessions with about 200 presentations of scientific papers, organised in twelve independent events. The papers are published in the Web conference proceedings, and a selection of them in special issues of two journals. One of them is Informatica, with its 37 years of tradition in excellent research publications.
The Information Society 2014 Multiconference consists of the following conferences and workshops: • Intelligent Systems • Cognitive Science • Data Mining and Data Warehouses • Collaboration, Software and Services in Information Society • Demographic Challenges • Robotics • Language Technologies • Human-Computer Interaction in Information Society • Education in Information Society • 1st Student Computer Science Research Conference • Environmental Ergonomics and Physiology • Chiron Workshop.
The Multiconference is co-organized and supported by several major research institutions and societies, among them ACM Slovenia, SLAIS and IAS. In 2014, the award for life-long outstanding contributions was delivered in memory of Donald Michie and Alan Turing for the second consecutive year. The Programme and Organizing Committees decided to award the Michie-Turing Award to Prof. Dr. Janez Grad. In addition, the award for the achievement of the year went to Prof. Dr. Janez Demšar. The information strawberry was awarded to Google Street View for finally covering Slovenia, while the information lemon went to the awkward introduction of cookie notices. Congratulations!
On behalf of the conference organizers we would like to thank all participants for their valuable contributions and their interest in this event, and particularly the reviewers for their thorough reviews.
Niko Zimic, Programme Committee Chair Matjaž Gams, Organizing Committee Chair ii KONFERENČNI ODBORI CONFERENCE COMMITTEES International Programme Committee Organizing Committee Vladimir Bajic, South Africa Matjaž Gams, chair Heiner Benking, Germany Mitja Luštrek Se Woo Cheon, Korea Lana Zemljak Howie Firth, UK Vesna Koricki-Špetič Olga S. Fomichova, Russia Mitja Lasič Vladimir A. Fomichov, Russia Robert Blatnik Vesna Hljuz Dobric, Croatia Mario Konecki Alfred Inselberg, Izrael Vedrana Vidulin Jay Liebowitz, USA Huan Liu, Singapore Henz Martin, Germany Marcin Paprzycki, USA Karl Pribram, USA Claude Sammut, Australia Jiri Wiedermann, Czech Republic Xindong Wu, USA Yiming Ye, USA Ning Zhong, USA Wray Buntine, Finland Bezalel Gavish, USA Gal A. Kaminka, Israel Mike Bain, Australia Michela Milano, Italy Derong Liu, Chicago, USA Toby Walsh, Australia Programme Committee Nikolaj Zimic, chair Matjaž Gams Ivan Rozman Franc Solina, co-chair Marko Grobelnik Niko Schlamberger Viljan Mahnič, co-chair Nikola Guid Stanko Strmčnik Cene Bavec, co-chair Marjan Heričko Jurij Šilc Tomaž Kalin, co-chair Borka Jerman Blažič Džonova Jurij Tasič Jozsef Györkös, co-chair Gorazd Kandus Denis Trček Tadej Bajd Urban Kordeš Andrej Ule Jaroslav Berce Marjan Krisper Tanja Urbančič Mojca Bernik Andrej Kuščer Boštjan Vilfan Marko Bohanec Jadran Lenarčič Baldomir Zajc Ivan Bratko Borut Likar Blaž Zupan Andrej Brodnik Janez Malačič Boris Žemva Dušan Caf Olga Markič Leon Žlajpah Saša Divjak Dunja Mladenič Igor Mekjavić Tomaž Erjavec Franc Novak Tadej Debevec Bogdan Filipič Vladislav Rajkovič Andrej Gams Grega Repovš iii iv KAZALO / TABLE OF CONTENTS Inteligentni sistemi / Intelligent Systems ................................................................................................................. 1 PREDGOVOR / FOREWORD ................................................................................................................................. 3 Multiobjective Optimisation of Water Heater Scheduling / Brence Jure, Gosar Žiga, Seražin Vid, Zupančič Jernej, Gams Matjaž ........................................................................................................................................... 5 Analiza nakupov in modeliranje pospeševanja prodaje v spletni trgovini mercator / Černe Matija, Kaluža Boštjan, Luštrek Mitja ......................................................................................................................................... 9 Analiza možnosti zaznavanja podobnosti med uporabniki / Cvetković Božidara, Luštrek Mitja .......................... 14 Visualization of Explanations of Incremental Models / Demšar Jaka, Bosnić Zoran, Kononenko Igor ................ 18 Detection of Irregularities on Automotive Semiproducts / Dovgan Erik, Gantar Klemen, Koblar Valentin, Filipič Bogdan ................................................................................................................................................... 22 An Elderly-Care System Based on Sound Analysis / Frešer Martin, Košir Igor, Mirchevska Violeta, Luštrek Mitja ..................................................................................................................................................... 26 Are Humans Getting Smarter due to AI? / Gams Matjaž ..................................................................................... 
30 Developing a Sensor Firmware Application for Real-Life Usage / Gjoreski Hristijan, Luštrek Mitja, Gams Matjaž ............................................................................................................................................................... 34 Automatic Recognition of Emotions From Speech / Gjoreski Martin, Gjoreski Hristijan, Kulakov Andrea .......... 38 Qualcomm Tricorder Xprize Final Round: A Review / Gradišek Anton, Somrak Maja, Luštrek Mitja, Gams Matjaž ............................................................................................................................................................... 42 Avtomatizacija izgradnje baze odgovorov virtualnega asistenta za občine in društva / Jovan Leon Noe, Nikič Svetlana, Kužnar Damjan, Gams Matjaž ................................................................................................ 46 Inferring Models for Subsystems Based on Real World Traces / Kerkhoff Rutger, Tavčar Aleš, Kaluža Boštjan .............................................................................................................................................................. 50 Inclusion of Visual y Impaired in Graphical User Interface Design / Konecki Mario ............................................. 54 Mining Telemonitoring Data from Congestive-Heart-Failure Patients / Luštrek Mitja, Somrak Maja ................... 58 Approximating Dex Utility Functions With Methods UTA And ACUTA / Mihelčić Matej, Bohanec Marko ........... 62 Comparing Random Forest and Gaussian Process Modeling in the GP-Demo Algorithm / Mlakar Miha, Tušar Tea, Filipič Bogdan ................................................................................................................................ 66 Comprehensibility of Classification Trees – Survey Design / Piltaver Rok, Luštrek Mitja, Gams Matjaž, Martinčič Ipšić Sandra ...................................................................................................................................... 70 Pametno vodenje sistemov v stavbah s strojnim učenjem in večkriterijsko optimizacijo / Piltaver Rok, Tušar Tea, Tavčar Aleš, Ambrožič Nejc, Šef Tomaž, Gams Matjaž, Filipič Bogdan ....................................... 74 Determination of Classification Parameters of Barley Seeds Mixed with Wheat Seeds by Using ANN / Sabancı Kadir, Aydın Cevat ............................................................................................................................. 78 Novi Govorec: naravno zveneč korpusni sintetizator slovenskega govora / Šef Tomaž ..................................... 81 Cloud-Based Recommendation System for E-Commerce / Slapničar Gašper, Kaluža Boštjan .......................... 85 Novel Image Processing Method in Entomology / Tashkoski Martin, Madevska Bogdanova Ana ...................... 89 Arhitektura sistema OpUS / Tavčar Aleš, Šorn Jure, Tušar Tea, Šef Tomaž, Gams Matjaž ............................... 93 Predictive Process-Based Modeling of Aquatic Ecosystems / Vidmar Nina, Simidjievski Nikola, Džeroski Sašo.................................................................................................................................................................. 97 Recognition of Bumblebee Species by their Buzzing Sound / Yusupov Mukhiddin, Luštrek Mitja, Grad Janez, Gams Matjaž ....................................................................................................................................... 
102
The Las Vegas Method of Parallelization / Zavalnij Bogdan .............................................................. 105
Resource-Demand Management in Smart City / Zupančič Jernej, Kužnar Damjan, Kaluža Boštjan, Gams Matjaž ............................................................................................................................................................. 109
Indeks avtorjev / Author index .............................................................................................................................. 113
Zbornik 17. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2014 Zvezek A
Proceedings of the 17th International Multiconference INFORMATION SOCIETY – IS 2014 Volume A
Inteligentni sistemi
Intelligent Systems
Uredila / Edited by Rok Piltaver, Matjaž Gams
http://is.ijs.si
7. - 8. oktober 2014 / October 7th - 8th, 2014
Ljubljana, Slovenia
PREDGOVOR
Konferenca Inteligentni sistemi je od leta 1997 naprej sestavni del multikonference Informacijska družba. Poglavitne teme so inteligentni sistemi in inteligentne storitve informacijske družbe, oz. programski sistemi informacijske družbe, tehnične rešitve v inteligentnih sistemih, možnosti njihove praktične uporabe, pa tudi trendi, perspektive, nujni ukrepi, prednosti in slabosti, priložnosti in nevarnosti, ki jih v informacijsko družbo prinašajo inteligentni sistemi.
V 2014 se nesluten razvoj informacijske družbe in zlasti umetne inteligence nadaljuje s čedalje hitrejšim tempom. V nekaterih državah po svetu že vozijo avtonomni avtomobili, npr. »Google car«. Nekoč utopične ideje Raya Kurzweila o točki singularnosti in preskoku v novo človeško ero se tako zdijo čedalje bližje. Hkrati pa se razlike med ljudmi povečujejo in nihče prav dobro ne razume družbenih sprememb, ki smo jim priča.
Tudi letos konferenca Inteligentni sistemi sestoji iz mednarodnega dela in delavnice; prispevki so tako v slovenskem kot angleškem jeziku. Sprejetih je več kot 25 prispevkov, ki so bili recenzirani s strani vsaj dveh anonimnih recenzentov, avtorji pa so jih popravili po navodilih recenzentov. Večina prispevkov obravnava raziskovalne dosežke Odseka za inteligentne sisteme Instituta »Jožef Stefan«. Hkrati s predstavitvijo poteka tudi aktivna analiza prispevkov vsakega predavatelja in diskusija o bodočih raziskavah.
Rok Piltaver in Matjaž Gams, predsednika konference
PREFACE
The Intelligent Systems conference has been one of the fundamental parts of the Information Society multiconference since its beginnings in 1997. The conference addresses important aspects of the information society: intelligent computer-based systems and the corresponding intelligent services, technical aspects of intelligent systems, their practical applications, as well as trends, perspectives, advantages and disadvantages, opportunities and threats that are being brought by intelligent systems into the information society.
The progress of the information society and of intelligent systems has continued to accelerate in recent years. For example, some countries have already allowed autonomous car driving. The once utopian ideas of Ray Kurzweil that human civilisation will embrace a new, intelligent era are becoming widely accepted. At the same time, it seems that nobody fully understands the emerging changes in human society.
The conference consists of an international event and a workshop, and presents over 25 papers written in English or Slovenian.
The papers have been reviewed by at least two anonymous reviewers and the authors have modified their papers according to the remarks. Papers from the Jozef Stefan Institute - Department of Intelligent Systems are presented separately. Each presentation consists of the classical paper report, and further includes analysis of researcher’s achievements and future research plans of each presenter. Rok Piltaver and Matjaž Gams, Conference Chairs 3 4 MULTIOBJECTIVE OPTIMISATION OF WATER HEATER SCHEDULING Jure Brence1∗, Žiga Gosar1∗, Vid Seražin1∗, Jernej Zupančič2, Matjaž Gams2 1Faculty of mathematics and physics, University of Ljubljana, Jadranska ulica 19, 1000 Ljubljana 2Department of Intelligent Systems, Jozef Stefan Institute, Jamova cesta 39, 1000 Ljubljana e-mail: {jure.brence, ziga.gosar, vid.serazin}@student.fmf.uni-lj.si, {jernej.zupancic, matjaz.gams}@ijs.si ABSTRACT is presented, which addresses the extraction of household wa- ter usage patterns with the goal of peak-shaving and reducing In this paper we present our work on the optimisation the load on the power-grid. In [3] similar goals are addressed, of water heater scheduling. The goal is to develop intelli- while approaching the problem from a different angle, util- gent strategies for controlling the electric heater and heat ising fuzzy logic to control electric water heaters. Similarly, pump in commercial combined water heaters. Strate- in [4] the focus is on a solution that decreases peak load on gies try to find the best compromise between comfort and the grid by scheduling heating outside peak hours. In [5] a price, based only on information about the temperature of simulation platform to model electric water heaters and test water in the reservoir. A simulation and testing environ- demand response control strategies in a smart grid is intro- ment has been implemented to compare the performance duced. of existing and new strategies. 3 THE PROBLEM 1 INTRODUCTION The aim is to develop intelligent strategies for the scheduling Hot water heating is the biggest component of electricity con- of water heating. There are several types of water heaters on sumption in residential homes, contributing as much as 20% the market, the difference being their source of energy. The to the total electricity consumption in an average Slovenian most interesting are combined water heaters that have both an household [11]. Water heater manufacturers continually de- electric heater and a heat pump at its disposal. The control velop improvements to the mechanical aspects of water heat- unit of a combined water heater is able to control the different ing. However, the potential for savings by smarter power heaters separately. At any given moment the controller de- scheduling is quite unexplored. Most water heater controllers cides whether a heater is to be turned off or on. Water heaters tend to keep water temperature at pre-set levels throughout typically have a single thermometer installed, usually on the the day, with the exception of user-defined schedules. This top of the water reservoir. This measurement is the only in- results in increased heat loss and, more importantly, bigger formation an intelligent controller gets about the state of the loads on the power grid during peak hours. An intelligent water in the reservoir and the consumption habits of the users. controller would be able to find and optimised schedule of The development of intelligent strategies is a multi- water heating, customised for the habits and wishes of users. 
objective optimisation problem. The first objective is the elec- It is important not only to minimise the price of heating, but tricity cost and the second is some measure of discomfort of to do so with a minimal increase in user discomfort level. the users. Any strategy will have to be a trade-off between the two. Our solution will be a set of strategies, among which the 2 RELATED WORK user will be able to choose the one with the desired trade-off between price and comfort. Some research on the topic of electric water heaters has al- ready been done. All stated sources are dealing with devices using only an electric heater, whereas our research focuses on 4 STRATEGIES combined devices. Much of existing work perceives user dis- We have implemented a number of different strategies. Each comfort as a constraint, rarely incorporating it as one of the falls into one of two categories that differ by the information objectives. that is available to the controller. In [1] solutions are provided for an electric water heater The simplest strategies are static strategies that use only that is connected to an electrical grid where the electricity tar- predefined settings and current measurements. These could iff is dynamically changed in real time, and mainly focuses be date, time, temperature and the temperature in the previ- on optimisation in regard to this tariff system. In [2] a model ous minute. Static strategies follow a predefined set of rules. ∗These authors equally contributed to the paper. While they do not learn or modify their behaviour, different 5 rule-sets may be defined for different periods of the day, or achieved with the heat pump alone, Bulk begins utilis- days in the week. Some static strategies: ing the electric water heater. This way, any heating is done directly before water consumption. It can also be 1. On-Off Control (lower T, upper T, electric heater, heat modified to heat during the lower price tariff to accumu- pump) is the strategy used in most commercial water late heat. This way Bulk produces a result with minimal heaters. Sometimes called Bang-Bang Control. The discomfort at an almost minimal price. boolean constants electric heater and heat pump spec- ify if the strategy is allowed to use electric heater and There is also a third category of strategies that learn from heat pump. When the temperature drops below lowerT the past and adjust their decision-making to best fit the user all available heat sources are turned on until upperT tem- habits. This kind of strategies are the final goal of our re- perature is reached. search. 2. Intervals (list of intervals with appropriate strategies) 5 METHODS uses different strategies in different parts of the day (e.g. when electricity is cheaper or when the user ex- The basic method applied in this research is the testing and pects higher water consumption). At initialisation we comparison of various scheduling strategies. We utilise com- can specify any number of intervals and corresponding puter simulations, as running these tests on real water heaters strategies. One example is a sub-strategy called Heat would require a lot of time and resources, which we do not Less at Noon which uses On-Off Control(40, 41, False, have at our disposal. To this purpose, we have developed a True) between 9 am and 3 pm and On-Off Control(45, water heater simulator and a water consumption simulator. 50, False, True) at other times. 
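The On-Off Control and Intervals strategies described above are simple enough to sketch in code. The following Python fragment is only an illustration under stated assumptions: the class name, the step() interface (current tank temperature in, heater commands out) and the minute-based control loop are assumptions, not the authors' implementation.

# Illustrative sketch of the static On-Off (bang-bang) control strategy
# described above. The interface (a step() method receiving the current
# tank temperature and returning which heaters to run) is a hypothetical
# assumption, not the actual controller code of the project.

class OnOffControl:
    def __init__(self, lower_t, upper_t, use_electric_heater, use_heat_pump):
        self.lower_t = lower_t                  # turn heating on below this temperature
        self.upper_t = upper_t                  # turn heating off above this temperature
        self.use_electric_heater = use_electric_heater
        self.use_heat_pump = use_heat_pump
        self.heating = False                    # current heating state (hysteresis)

    def step(self, temperature):
        """Return (electric_heater_on, heat_pump_on) for the current minute."""
        if temperature < self.lower_t:
            self.heating = True                 # all allowed heat sources on
        elif temperature >= self.upper_t:
            self.heating = False                # target reached, everything off
        return (self.heating and self.use_electric_heater,
                self.heating and self.use_heat_pump)


# Example: the "Heat Less at Noon" sub-strategy mentioned above uses
# On-Off Control(40, 41, False, True) between 9 am and 3 pm and
# On-Off Control(45, 50, False, True) at other times.
noon_controller = OnOffControl(40, 41, use_electric_heater=False, use_heat_pump=True)
default_controller = OnOffControl(45, 50, use_electric_heater=False, use_heat_pump=True)

def intervals_strategy(minute_of_day, temperature):
    controller = noon_controller if 9 * 60 <= minute_of_day < 15 * 60 else default_controller
    return controller.step(temperature)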
This strategy is similar up to constants values to some real strategies consumers 5.1 Water heater simulation use. Real specifications [9, 10] of commercial water heaters were 3. New On-Off (lower boundary for electric heater, lower used, namely: dimensions of the reservoir, power of heaters, boundary for heat pump). When the temperature drops coefficient of performance (COP) of the heat pump, thermal below the predefined lower boundary for electric heater conductivity of the insulation and maximum flow rate. Typ- the electric heater turns on until the temperature is higher ical water heaters are shaped cylindrically, with cold water than this boundary. The heat pump works on the same entering the reservoir at the bottom and hot water leaving on principle but with a different boundary temperature, the top. The position of the heating element varies with the which is usually higher. model. Some manufacturers choose to position the heater at the bottom, to encourage the convection of hot water, oth- 4. Rules Z. A day is divided in N regions that are set by ers attempt to heat uniformly along the vertical axis, or some the user. In each region a set of boundary temperatures, other option. In current tests water is heated uniformly. Com- as well as boundary temperature changes is defined. The bined water heaters have two types of heaters: electric heater two different heaters are turned on or off based on the and heat pump. With the electric heater, the thermal power boundary conditions for the current region. it produces is equal or close to equal to the electric power it consumes. As such, its heating power is fixed. The heat Oracle strategies are given the future water consumption pump, on the other hand, produces more thermal energy than schedule that they use to calculate the plan of how and when the amount of electric energy it uses. The ratio between the they will heat the water. We use these strategies to get the two – COP – typically falls into a range from 2 to 5.5. The best trade-off between discomfort and price. There is no other COP of a heat pump depends on the temperature of the heat strategy with a strictly better performance in both objectives. source, often the outside air, and the temperature of water. 1. Brute Force makes decisions at discrete time intervals During the heating process, as water temperature increases, of predefined length, usually 1 or 10 minutes. At every the COP drops. step four options are available: no heating, only electric The simulation does not attempt to simulate the complex heater, only heat pump, or both. Brute force simulates thermodynamics and fluid mechanics happening in the wa- every possibility, looking for the optimal one. Theoret- ter heater. It rather uses a simplified model that manages to ically every possible strategy would be tested by Brute emulate the responses of the built-in thermometer to various Force, allowing us to find the true Pareto front. This inputs. The water in the reservoir is divided into 20 layers approach is not practical due to its computational ineffi- along the vertical axis. All the water in one layer has the ciency. same temperature. The water heater is simulated with a one minute step. Each step, energy losses are calculated for each 2. Bulk starts with a decision to never heat. It simulates layer, taking into account the water temperature, outside tem- the water heater until it reaches discomfort. Then it perature and the thermal conductivity of the container walls. 
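As a rough illustration of the simplified reservoir model described above, the sketch below advances a stack of water layers by one simulated minute. The reservoir and heater ratings (230 L tank, 1500 W electric heater, 2000 W heat-pump heating power at a COP of 3.3) are taken from the test setup reported later in the paper; the per-layer loss coefficient, the function names and the omission of inter-layer heat exchange, layer merging and water draws are simplifying assumptions.

# Illustrative sketch of a layered water-heater model in the spirit of the
# simulator described above: the reservoir is split into horizontal layers,
# each with a single temperature; every simulated minute each layer loses heat
# through the container walls and, when heating is on, receives an equal share
# of the delivered thermal energy. Names and constants are assumptions.

N_LAYERS = 20
LAYER_VOLUME_L = 230.0 / N_LAYERS       # 230 L reservoir split into 20 layers
WATER_HEAT_CAPACITY = 4186.0            # J/(kg*K); 1 L of water is roughly 1 kg
LOSS_COEFF_W_PER_K = 0.1                # assumed wall-loss coefficient per layer

def simulate_minute(layers, outside_t, electric_on, heat_pump_on,
                    electric_power_w=1500.0, heat_pump_thermal_w=2000.0, cop=3.3):
    """Advance the layer temperatures by one minute; return electricity used in kWh."""
    dt = 60.0  # seconds in one simulation step
    # 1) heat losses through the container walls, computed per layer
    for i, t in enumerate(layers):
        loss_j = LOSS_COEFF_W_PER_K * (t - outside_t) * dt
        layers[i] = t - loss_j / (LAYER_VOLUME_L * WATER_HEAT_CAPACITY)
    # 2) uniform heating: the heat pump delivers heat_pump_thermal_w of thermal
    #    power while drawing only heat_pump_thermal_w / cop of electric power
    thermal_w = (electric_power_w if electric_on else 0.0) + \
                (heat_pump_thermal_w if heat_pump_on else 0.0)
    heat_per_layer_j = thermal_w * dt / N_LAYERS
    for i in range(N_LAYERS):
        layers[i] += heat_per_layer_j / (LAYER_VOLUME_L * WATER_HEAT_CAPACITY)
    # 3) electric energy consumed during this minute, in kWh
    electric_w = (electric_power_w if electric_on else 0.0) + \
                 (heat_pump_thermal_w / cop if heat_pump_on else 0.0)
    return electric_w * dt / 3.6e6

# The controller only ever sees the built-in thermometer at the top of the
# reservoir, i.e. layers[-1], not the full temperature profile.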
starts rewinding time back and turning the heat pump Heat exchange between neighbouring layers is simulated with on until the discomfort reaches zero. If this cannot be an experimentally set heat transfer coefficient to match real 6 data. When heating is turned on, each layer receives its share corresponding temperature. The total discomfort of a mea- of thermal energy. As the simulator receives a request for surement is defined as the sum of individual discomforts. hot water, it removes the appropriate volume of water from The calculations of price have to take into account different the top layers and adds cold water layers at the bottom. The price tariffs. The majority of Slovenian electricity providers number of layers and their individual size is kept in check by use a two-tariff system, with the lower tariff from 22.00 to joining neighbouring layers with similar temperatures. Man- 6.00 and during weekends and the higher tariff from 6.00 to ufacturers usually take special care to minimise the mixing of 22.00 during week days. The prices vary between suppliers. water in the reservoir. This provides the consumer with a bet- We use a lower tariff price of 0.04320 e/kWh and a higher ter experience that is, the outgoing water stays at the almost tariff price of 0.07795 e/kWh [11]. the same temperature during a shower, unless all hot water is used up. Our simulator is able to reproduce this behaviour to a sufficient degree. 6 TESTING AND RESULTS At the beginning of the test, the generator of water consump- 5.2 Water consumption simulation tion produces a semi-random plan of consumption. The water heater is simulated with a one minute step for the specified A number of sources [6, 7, 8] for water consumption measure- duration of the experiment, usually several weeks. Water is ment were used to develop a simulation of water consumption used according to the schedule, while the heating of the water during weekdays and weekends. When a household with a heater is controlled by the tested strategy. The process is re- specified number of members is generated, each individual peated for other strategies using the same consumption sched- is assigned a semi-random consumption pattern. A specific ule. The whole experiment is ran multiple times with different consumption schedule is then generated based on the patterns consumption schedules. The result of the experiment are the of individuals, with added variance using Gaussian distribu- average price and discomfort for each of the tested strategies tions. We separate two types of events. 3 to 10 small events (Figure 2). (e.g. washing hands) per user are randomly scattered through- As anticipated, Bulk achieves the best comfort, which is out the day, taking one minute and using less than 1 litre of usually near zero. Other strategies with comparable comfort hot water. Large events (showers) happen 1 to 3 times per day achieve it at a much higher price. A generally best perfor- per user, and take between 5 and 30 minutes, with 1-6 litres mance is achieved by On-Off Control, using only the heat of water per minute at a mean 38◦C. pump heater. Most of our static strategies are dominated by Both real and simulated water consumptions vary greatly On-Off Control and Bulk. on a day to day basis. Figure 1 shows the distribution of sim- Each strategy has a number of parameters that can be var- ulated hot water consumption over a longer time period. ied to achieve different results. 
By varying the boundaries of On-Off Control we produce three fronts, one for each type of heating (figure 2). Varying the parameters of Oracle strategies would produce another front. Ideal solutions would dominate On-Off Control – type strategies, while being dominated by Oracle strategies. 7 Conclusion The project is aiming to develop intelligent strategies for the scheduling of water heating in commercial water heaters. So Figure 1: Simulated hot water consumption, averaged over far we have developed a complete testing environment for 50 week days for 100 different households. comparing different strategies. We have implemented and tested most commercially used strategies. For comparison we have also implemented one Oracle strategy that achieves the best comfort possible. 5.3 Discomfort and price In order to find a good approximation of the Pareto front Different strategies under different water consumption pro- for our consumption simulator, we intend to develop a set of files were evaluated using discomfort and price as criteria. Oracle strategies that are capable of achieving a specific trade- Discomfort for a minute of our simulation is defined as: off of price and comfort. We also plan to utilise evolutionary algorithms to optimise 0 if To ≥ Tr the various static strategies. Finally, we want to develop in- discomf ort = (1) (Tr−To)∗V if T telligent strategies that adapt by learning from the past. 1000 o < Tr An immediate application of our system is to provide the where Tr is the requested temperature by the user, To is the user with a more intuitive way of choosing the most appropri- outflow temperature and V is the volume of water with the ate strategy. As it stands, users manually choose the On-Off 7 Figure 2: Averaged price and discomfort for a number of static strategies and Bulk. The colours dark cyan, blue and green represent static strategies using only the electric heater, only the heat pump and both, respectively. Strategies coloured orange are variations of Intervals. Other static strategies are coloured purple. Oracle strategies are black. Household with 4 members, using a 230 L water heater with a 1500 W electric heater and a heat pump with a heating power of 2000 W and a COP of 3.3 at 35oC. Control settings, which is generally the preferred water tem- Strategies for Demand Response. In Power and Energy perature. With our simulation, users would only need to de- Society General Meeting, IEEE (2012): 1-8. cide on a price to comfort trade-off, and the controller would choose the best strategy and settings to improve their comfort [6] Energy Monitoring Company, Energy Saving Trust. while lowering their costs. Measurement of Domestic Hot Water Consumption in Dwellings (2008). References [7] B. Schoenbauer, D. Bohac, M. Hewett. Measured Resi- dential Hot Water End Use. ASHRAE Transactions 118 [1] P. Du, N. Lu., Appliance Commitment for Household (2012): 872-889. Load Scheduling. Smart Grid, IEEE Transactions on Smart Grid 2, no. 2 (2011): 411-419. [8] Equipment Energy Efficiency. Water Heating Data Col- lection and Analysis, 2012. [2] L. Paull, H. Li, L. Chang. A novel domestic electric water heater model for a multi-objective demand side manage- [9] Coolwex. DSW 300 - Navodila za inštalacijo in uporabo. ment program. Electric Power Systems Research 80, no. Klima center Horizont, Maribor, 2010. 12 (2010): 1446-1451. [10] Kronoterm. Katalog produktov, 2014. URL http:// [3] B. J. LaMeres, M. H. Nehrir, V. 
Gerez Controlling the www.kronoterm.com/wp-content/uploads/ average residential electric water heater power demand 2014/katalog-40-mar-2014-web.pdf. Ac- using fuzzy logic. Electric Power Systems Research 52, quired 20.8.2014. no. 3 (1999): 267-271. [11] Statistični urad Republike Slovenije. Podatkovni por- [4] A. Moreau. Control Strategy for Domestic Water Heaters tal SI-STAT, Okolje in naravni viri, Seznam tabel. during Peak Periods and its Impact on the Demand for URL http://pxweb.stat.si/pxweb/Dialog/ Electricity. Energy Procedia 12 (2011): 1074-1082. statfile2.asp [5] R. Diao, S. Lu, M. Elizondo, E. Mayhorn, Y. Zhang, N. Samaan. Electric Water Heater Modeling and Control 8 ANALIZA NAKUPOV IN MODELIRANJE POSPEŠEVANJA PRODAJE V SPLETNI TRGOVINI Matija Č erne (Fakulteta za matematiko in fiziko, Jadranska 19, 1000 Ljubljana, Slovenija), Boštjan Kaluža, Mitja Luštrek Odsek za inteligentne sisteme, Inštitut Jožef Stefan Jamova cesta 39, 1000 Ljubljana, Slovenija Tel: +386 1 4773419; fax: +386 1 4251038 e-mail: matija.cerne@student.fmf.uni-lj.si bostjan.kaluza@ijs.si mitja.lustrek@ijs.si POVZETEK Večina analize je temeljila na dveh datotekah, katerih izseka Analizirali smo podatke o nakupih v spletni trgovini. vidimo spodaj: Cilja sta bila ugotoviti u činek spremembe cene na potrošnjo in identifikacija potrošnikovih preferenc v šifra nakupa šifra izdelka količina cena opis izdelka nekem trenutku. Pri analizi smo uporabljali tako 1 349908 150502 1 1.49 Dzem Eta. 450g mikroekonomske kot tudi statistične pristope. V grobem 2 386589 150502 1 1.49 Dzem Eta. 450g lahko metode analize razdelimo na dva sklopa – tiste, ki 3 384333 150502 1 1.49 Dzem Eta. 450g se osredotočajo na uporabnika in tiste, pri katerih je 4 350190 150502 1 1.49 Dzem Eta. 450g pomemben le artikel. 5 350564 150502 1 1.49 Dzem Eta. 450g 6 350550 150502 1 1.49 Dzem Eta. 450g 1 UVOD 7 344657 150507 1 0.34 Sol Morska 1kg Zanimajo so nas predvsem rezultati, ki bi jih lahko uporabili 8 341269 150507 1 0.34 Sol Morska 1kg za priporočanje artiklov tako znanim (obstoječim), kot tudi 9 341373 150507 1 0.34 Sol Morska 1kg neznanim (novim) uporabnikom. Pri priporočanju gre za to, da čimbolj natančno ugotovimo, kateri izdelek bi nekega 10 345727 150507 1 0.34 Sol Morska 1kg uporabnika poleg kupljenih še utegnil zanimati, nato pa mu Podatki o nakupih (izsek). Št.vrstic: 15176 ta izdelek spletna prodajalna priporoči. Iz prodajalčevega vidika je to precej pomembno orodje pospeševanja prodaje, znesek šifra nakupa šifra uporabnika datum nakupa naročila še posebej v kontekstu spletne trgovine. V nasprotju s 1 334366 127348 3.47 2012-07-24 klasično trgovino lahko tu v vsakem trenutku vidimo uporabnikovo košarico, pa tudi uporabnika samega lahko 2 335402 37507 8.05 2012-07-27 identificiramo, kar v praksi (za ne-uporabnike raznih kartic 3 336527 248562 30.94 2012-08-02 zvestobe) ni izvedljivo. Priporočila se na spletni strani 4 336934 248562 1.49 2012-08-06 izvedejo v obliki seznama priporočenih izdelkov, kar je 5 337402 248562 1.34 2012-08-07 izvedljivo v realnem času, če smo podatke predhodno 6 337404 37507 9.16 2012-08-08 pravilno obdelali. 7 337634 249741 8.29 2012-08-08 8 337643 249741 100.58 2012-08-08 2 PODATKI 9 337648 248562 2.29 2012-08-08 Na voljo smo imeli podatke o vseh nakupih, ki so se v 10 337663 248562 17.24 2012-08-08 spletni trgovini zgodili med 24. Julijem 2012 in 15. Podatki o prodanih izdelkih (izsek). Št. vrstic: 347332 Januarjem 2014. Za posamezne izdelke tako vemo kdaj, koliko in po kakšni ceni so bili prodani. 
Ob kasnejši Poleg tega smo imeli tudi podatke o uporabnikih, ki so obdelavi smo sicer ugotovili, da obstaja možnost, da povedali ali je posamezen uporabnik fizična oseba ali določeni podatki manjkajo (predvsem na začetku obdobja), podjetje, ter poštno številko njegovega prebivališča. Ker je vendar je to upoštevano v analizi oziroma pri rezultatih. 9 cilj projekta priporočanje produktov fizičnim uporabnikom, 2.3 Obdelava podatkov smo se odločili da bomo obravnavali samo fizične osebe. Če Za nadaljno uporabo je bilo potrebno združiti podatke o bi obravnavali podatke obeh kategorij skupaj, bi zaradi naročenih izdelkih in uporabnikih – torej pogledati, kateri velike razlike v obsegu potrošnje, pa tudi zaradi specifičnih ‘order ID-ji’ pripadajo kateremu uporabniku in nato za potreb podjetnikov ki se navadno razlikujejo od potreb posamezno naročilo (order) združiti izdelke ki so bili 'običajnih' potrošnikov, verjetno dobili precej nezanesljive kupljeni. rezultate. Zato smo najprej iz podatkov o nakupih izločili Za potrebe cenovne analize je bilo potrebno podatke tiste, ki so jih opravila podjetja. Glede lokacije uporabnikov transformirati v takšno obliko, da lahko razberemo se nam v dosedanji analizi to ni zdel dovolj pomemben informacijo o potrošnji ob določeni ceni. Natančneje, dejavnik pri potrošnikovih odločitvah in temu nismo potrebovali smo neko mero za ‘moč potrošnje’ v določenem posvečali posebne pozornosti. cenovnem obdobju (torej obdobju med dvema Na voljo smo imeli še podatke o lastnostih izdelkov, vendar spremembama cene) in najbolj logična mera se je zdela teh podatkov nismo obravnavali. frekvenca nakupov (enota: št.izdelkov/dan): Ena od težav je bila ta, da nismo imeli točnih podatkov o datumih sprememb cen in smo tako datum spremembe morali aproksimirati z datumom, ko se je prvič zgodil nakup po novi ceni. To pa za izdelke, ki se ne kupujejo vsak dan (in takšnih je večina) pomeni, da se obdobja ko neka cena Tu je sicer nastopil problem določitve obdobij ko velja neka velja, lahko precej razlikujejo od resničnih obdobij. Ravno cena, saj kot smo že prej omenili, nimamo točnih datumov zaradi tega, pa tudi zaradi premajhne količine podatkov (kar sprememb. Problematični so bili predvsem primeri, ko se je bi imelo za posledico premalo zanesljive rezultate) smo se nek nakup zgodil po ceni pred spremembo, vendar je bil odločili, da v analizah, kjer je to pomembno, obravnavamo zabeležen datum komaj v obdobju, ko je veljala naslednja samo določeno število izdelkov, za katere imamo dovolj cena – tako se je večkrat zgodilo tudi, da smo imeli na isti podatkov. dan iste artikle prodane po različnih cenah. Možna razlaga za to je, da se je cena izdelka zabeležila ob izdaji računa, 2.2 Vizualizacija podatkov datum nakupa pa je obveljal kot datum plačila – ni namreč Graf prikazuje, kako so porazdeljeni uporabniki glede na nujno, da je bil račun takoj plačan. Kakorkoli, v takšnih število nakupov, ki jih opravijo (horizontalna os) in primerih je bilo potrebno ‘izravnati šum’ in naročila s staro povprečno vrednost nakupa (vertikalna os). Vsaka pika ceno postaviti v prejšnje obdobje, sicer bi imeli ob predstavlja enega uporabnika: nekaterih spremembah cene lahko hude distorzije v frekvenci nakupov. To je bilo (za nekatere obravnavane izdelke) narejeno kar ročno, saj bi bilo sicer pretežko dovolj dobro definirati, katerim naročilom je potrebno spremeniti datum. 
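Kot ilustracijo zgoraj opisane obdelave podatkov navajamo spodnjo skico v Pythonu (knjižnica pandas), ki združi podatke o prodanih izdelkih z naročili, obdrži le fizične osebe ter za izbrani izdelek izračuna frekvenco nakupov po cenovnih obdobjih. Imena stolpcev sledijo zgornjima izsekoma, stolpec o tipu uporabnika ter imena funkcij pa so zgolj naše predpostavke in ne dejanska implementacija projekta.

# Sketch of the preprocessing described above: join purchase lines with their
# orders, keep only physical persons and compute the purchase frequency
# (items per day) for every price period of one product. Column names follow
# the excerpts shown earlier; everything else is an assumption.
import pandas as pd

def frekvence_nakupov(nakupi: pd.DataFrame, narocila: pd.DataFrame,
                      uporabniki: pd.DataFrame, sifra_izdelka: int) -> pd.DataFrame:
    # keep only orders placed by physical persons (column name is hypothetical)
    fizicne = uporabniki.loc[uporabniki["fizicna oseba"], "šifra uporabnika"]
    narocila = narocila[narocila["šifra uporabnika"].isin(fizicne)]

    # attach the order date to every purchased item
    vrstice = nakupi.merge(narocila[["šifra nakupa", "datum nakupa"]], on="šifra nakupa")
    vrstice = vrstice[vrstice["šifra izdelka"] == sifra_izdelka].sort_values("datum nakupa")

    # a new price period starts whenever the recorded price changes; the change
    # date is approximated by the first purchase at the new price
    vrstice["obdobje"] = (vrstice["cena"] != vrstice["cena"].shift()).cumsum()

    def povzetek(skupina):
        dni = (skupina["datum nakupa"].max() - skupina["datum nakupa"].min()).days + 1
        return pd.Series({"cena": skupina["cena"].iloc[0],
                          "frekvenca": skupina["količina"].sum() / dni})

    return vrstice.groupby("obdobje").apply(povzetek)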
SLIKA 1: porazdelitev uporabnikov spletne trgovine Opazimo, da večina uporabnikov za svoj nakup zapravi okoli 70 eur, in v obravnavanem obdobju manj kot petnajstkrat nakupuje v spletni trgovini. SLIKA 2: Graf frekvenc nakupov za izdelek ‘Mineralna voda Radenska classic, kombiniran z grafom cen 10 3. ANALIZA - METODE IN REZULTATI vendar vseeno vsaj okvirno vidimo, kateri izdelki so bolj, Najprej smo analizirali potrošnjo v odvisnosti od cen kateri pa manj občutljivi na spremembe v ceni (v tabeli (cenovna analiza), nato pa še analizirali nakupovalne izgleda da je Alpsko 3,5 najbolj, 1,6 pa najmanj cenovno navade desetih najbolj zanimivih uporabnikov. stabilno). Vseeno ta metoda ni najbolj zanesljiva za napovedovanje potrošnje, saj lahko predvidevamo, da so 3.1 Cenovna analiza spremembe odvisne tudi od številnih drugih dejavnikov (npr Pri cenovni analizi raziskujemo, kako se potrošnja oglaševanje, substituti iz drugih trgovin, substituti ki jih (frekvenca nakupov) spreminja v odvisnosti od spremembe nismo upoštevali pri analizi, šum na podatkih, …). V tabeli v ceni. Z uporabo mikroekonomskega pojma cenovne je to najbolj vidno pri Alpskem mleku 3,5, kjer izgleda da elasti je 1% spremembe v ceni prinesel 2,42% spremembe čnosti smo poskušali oceniti vpliv spremembe cene na potrošnjo istega oziroma sorodnih izdelkov. Nato nas (pozitivne!) v potrošnji. Ko pogledamo na graf frekvenc, pa zanima tudi, kaj se dogaja ob specifi opazimo, da je potrošnja dobrega pol leta od začetka čni kratkotrajni spremembi cene – akciji. merjenja skoraj nič, torej se lahko upravičeno vprašamo, ali je to res (kar je malo verjetno, glede na to da gre za enega 3.1.2 Cenovna elasti najbolj prodajanih artiklov za katerega dobro vemo, da ni č nost Za u prišel v prodajo komaj pred enim letom) in če se je mogoče činek sprememb cene na potrošnjo istega izdelka cenovno elasti zgodila napaka pri knjiženju naročil – recimo če se je vmes čnost izračunamo tako: zamenjala koda izdelka in to ni bilo popravljeno v bazi podatkov. Za učinek spremembe cene nekega izdelka na potrošnjo 3.1.2 Analiza uč inkov akcij nekega drugega izdelka (substituta) potrebujemo koeficient Ob opazovanju grafov prometa (prihodki od prodaje na dan; križne elastičnosti. Ta nam pove, za koliko odstotkov se cena krat frekvenca) in cen v času za nekatere izdelke spremeni potrošnja dobrine B ob spremembi cene dobrine opazimo, da za kratkotrajne padce v ceni (akcije) frekvenca A za en odsototek: potrošnje za to obdobje naraste, kar seveda ni presenetljivo. Zanimivo pa je dejstvo, da se velikokrat potrošnja po koncu akcije (vrnitvi cene na isto ali višjo raven kot prej) ne vrne V prvem primeru pričakujemo, da bo vrednost negativna na raven pred akcijo, temveč ostane višja kot je bila tedaj. (če se cena poveča, se troši manj nekega izdelka), Ob tem velja poudariti, da na potrošnjo poleg samega posledično pa pri sorodnih izdelkih (komplementih) znižanja cene gotovo vpliva tudi to, da se ob akciji tudi pričakujemo, da se bo potrošnja ob nespremenjeni ceni poveča promocija za izdelek (npr. objava v katalogu, povečala. Za potrebe računanja križnih elastičnosti je bilo reklama po televiziji). Poglejmo ta efekt na grafu za ‘Mleko potrebno še enkrat naračunati frekvence nakupov (q), tokrat Lejko 1,5%’ : po datumih sprememb cene vseh ostalih izdelkov, ki jih opazujemo skupaj. Spremembe se seveda ne zgodijo samo enkrat, zato ob vsaki spremembi cene lahko izračunamo novo cenovno elastičnost (tako enostavno kot križno). 
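Enačbi za cenovno in križno elastičnost, na kateri se besedilo sklicuje, sta standardni: lastna cenovna elastičnost je E = (%Δq)/(%Δp), križna elastičnost pa E_AB = (%Δq_B)/(%Δp_A). Spodnja skica v Pythonu (imena funkcij so naša predpostavka) ponazarja izračun elastičnosti ob posamezni spremembi cene in povprečenje čez vse spremembe.

# Hedged sketch of the elasticity calculation referred to above. The formulas
# are the standard microeconomic ones:
#   own-price elasticity:    E    = (%change in q)   / (%change in p)
#   cross-price elasticity:  E_AB = (%change in q_B) / (%change in p_A)
# evaluated around each price change and then averaged. Names are assumptions.

def elasticity(p_before, p_after, q_before, q_after):
    """Percentage change in purchase frequency divided by percentage change in price."""
    dp = (p_after - p_before) / p_before
    dq = (q_after - q_before) / q_before
    return dq / dp

def average_elasticity(price_changes, frequency_changes):
    """Average elasticity over all price changes of product A.

    price_changes:     list of (p_before, p_after) for product A,
    frequency_changes: list of (q_before, q_after) purchase frequencies of the
                       observed product (A itself, or a substitute B for the
                       cross elasticity) in the periods around the same dates.
    """
    values = [elasticity(pb, pa, qb, qa)
              for (pb, pa), (qb, qa) in zip(price_changes, frequency_changes)
              if pb != pa and qb > 0]
    return sum(values) / len(values) if values else float("nan")

# Illustrative numbers only: a 10 % price increase of product A followed by a
# 5 % rise in the purchase frequency of substitute B gives E_AB = +0.5.
print(average_elasticity([(1.00, 1.10)], [(2.00, 2.10)]))  # ~0.5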
Rezultate nato lahko aproksimiramo regresijsko ali pa izračunamo povprečje. Zaradi velike distorziranosti podatkov se je druga metoda izkazala za bolj primerno. V naslednji tabeli so predstavljeni rezultati (povprečne elastičnosti) za skupino substitutov ‘Mleka’, kjer so bili rezultati še najbolj skladni s pričakovanji: Po diagonali so vrednosti izračunane po prvi formuli, na SLIKA 3: Padec cene (akcija) je označ en z rdeč o elipso, promet pred in po akciji pa z modrima ostalih mestih pa po principu križne elasti č rtama čnosti. Razlaga vrednosti v tabeli (gledamo zadnjo vrstico). Če se cena mleka znamke 1,6 poveča za 1%, tedaj se potrošnja (število Pri obravnavanih izdelkih (100 artiklov z najvišjo prodajo) prodanih artiklov na dan) mlek 3,5, Alpsko 1,6 in Alpsko se je ta situacija večkrat ponovila. Za definirano akcijo (v 3,5 po vrsti poveča za 1,36 %, 1,1 %, in 1,06 %. Obenem se našem primeru je akcija definirana kot vsaka sprememba potrošnja mleka zmanjša za 1,02 %. cene za vsaj -4% in v trajanju največ tri tedne) smo izsledke Še vedno seveda ne moremo z gotovostjo trditi, da se bo predstavili v tabeli, kjer smo izračunali, kakšna je potrošnja spreminjala točno tako kot so vrednosti v tabeli, procentualna sprememba v prometu med (2. stolpec) in po 11 akciji (3. stolpec). Za izdelke, ki so imeli več akcij, smo izračunali povprečno spremembo, lahko pa bi uporabili tudi 3.2.2 Ciklič na potrošnja izdelkov kakšno drugo metodo aproksimacije – recimo z metodo Za izdelke, ki jih obravnavani uporabnik dovolj pogosto najmanjših kvadratov v tridimenzionalnem prostoru. kupuje, poskušamo ugotoviti, ali jih kupuje v časovnih Rezultati so predstavljeni v TABELI 1: Uč inki akcij (ki se intervalih in le-te identificirati. Tudi ta problem je občutljiv nahaja v Dodatku), vse vrednosti pa so v odstotkih. na število nakupov. Uporabna vrednost te informacije je v Tabela je urejena po zadjem stolpcu, torej nam pove, katere tem, da lahko v danem trenutku predvidimo, ali se bo zgodil izdelke (izmed 100 obravnavanih) se najbolj splača nakup nekega izdelka s strani obravnavanega uporabnika, postaviti v akcijo, če želimo pozitiven efekt na promet tudi ali ne. po akciji. Podatki (predvsem prvih nekaj) izgledajo precej Uporabimo statistični pristop – iščemo interval zaupanja, v nerealni in tako ekstremne vrednosti lahko pripišemo katerem bi se z neko verjetnostjo zgodil naslednji nakup. Ta motnjam pri podatkih. Vseeno lahko opazimo, da se v nam za določeno stopnjo (med 0 in 1) in ocene parametrov splošnem akcija prodajalcu iz vidika povečevanja prometa pove meje intervala, v katerem se nahaja neka slučajna splača – prvo se mu poveča promet zaradi povečane spremenljivka (naslednji nakup), ki je porazdeljena isto kot potrošnje, potem pa zaradi kombinacije povečane potrošnje so porazdeljeni podatki. Parametri so: perioda (povprečen in (ponovnega) dviga cene. Vendar pa lahko predvidevamo, čas, ki mine med dvema nakupoma določenega izdelka), da ob akciji zaradi učinka substitucije povzročimo padec standardni odklon (pove, kako močno varirajo časi med prometa za druge, podobne izdelke. nakupi), in datum zadnjega nakupa. Če privzamemo, da se trenutno nahajamo v času 2014-01- 3.2 Analiza nakupov uporabnikov 16 (prvi dan, za katerega nimamo več podatkov) Izbrali smo deset uporabnikov z največ nakupi, saj nam to predvidevamo, da bo uporabnik, v kolikor na ta dan opravi zagotavlja dovolj veliko količino podatkov za analizo nakup, kupil izdelke, ki so v TABELI 2: ciklič nost vsakega posebej. 
Iste metode kot so predstavljene v tem potrošnje ki se nahaja v Dodatku, obarvani rumeno (za te razdelku seveda lahko uporabimo tudi pri uporabnikih z izdelke je ‘trenutni’ datum 2014-01-16 znotraj intervala). manj nakupi, vendar se s tem (za nekatere metode) znatno Gledamo. zmanjša točnost napovedi. Za te metode bi bilo zato v V našem primeru je stopnjo zaupanja 0,9. Za izdelek s šifro primeru praktične uporabe smiselno določiti neko spodnjo 157869 ("Solata endivija", 2. vrstica v tabeli) bo tako glede mejo za število nakupov, ki jih je uporabnik že opravil. na naše podatke veljala napoved, da se bo naslednji nakup z verjetnostjo 90 % zgodil v obdobju med 16. in 30. 1. 2014. 3.2.1 Identifikacija zaželenih in nezaželenih izdelkov Pri tem je potrebno poudariti, da bi se v praksi ocene Radi bi opredelili odnos do izdelkov, ki jih obravnavani parametrov računale sproti, torej bi se z akumulacijo uporabnik kupuje. Natančneje, zanima nas, ali obstajajo podatkov natančnost napovedi povečevala. izdelki, za katere lahko sklepamo, da jih je uporabnik kupil le enkrat in nato nikoli več? Takšnih izdelkov potem temu 4. ZAKLJUČEK in njemu podobnim uporabnikom ne priporo čamo, saj Najprej smo opravili cenovno analizo, ki temelji na predvidevamo da uporabnik z izdelkom ni bil zadovoljen. podatkih o prodanih izdelkih. Cilj analize je bil predvsem V Dodatku je izsek grafa (GRAF 1 : nakupi uporabnika), ki raziskati, kako se potrošnja odziva na spremembe v ceni. Tu prikazuje nakupe uporabnika. Graf je precej velik smo ločili splošno obravnavo in obravnavo posebnih (natančneje, višina je število različnih artiklov, ki jih sprememb v ceni – akcij. Pridobljeni rezultati so bili v uporabnik kupi, v konkretnem primeru okoli 1000, dolžina nekaterih primerih pričakovani, v drugih nekoliko manj. pa število nakupov (251)). V vrsticah so predstavljeni Nato smo analizirali nakupovalne navade nekaterih izdelki, pika pa pomeni da je bil nek izdelek kupljen uporabnikov, kar je uporabno predvsem za potrebe (nakupi so predstavljeni na ordinatni osi). priporočanja in je tudi prvotni cilj projekta. Najprej smo se Verjetno nezaželeni izdelki za obravnavanega uporabnika so osredotočili na ‘negativno selekcijo’ priporočanja, torej smo tisti, ki se pojavijo na grafu le enkrat – na izseku so poskušali identificirati izdelke ki jim bomo dali negativno obarvani s sivo. Mera gotovosti za to, da smo pravilno utež. Tu je pomembno, da upoštevamo ‘mero gotovosti’, ki napovedali ‘nezaželene izdelke’ mora temeljiti na številu smo jo zaenkrat le opisno opredelili. Nato smo preverili, kaj nakupov, ki jih uporabnik opravi po tem, ko kupi lahko predvidimo o času nakupa nekega izdelka in po ‘nezaželen izdelek’ in na tipu izdelka (ali gre za izdelek ki statistični analizi prišli do zaključka, da za dovolj obsežno se sicer troši pogosto). količino podatkov lahko napovemo časovni interval, ko se Mogoče bi lahko tudi ugotovili, ali je uporabnik izdelek zgodi naslednji nakup in povedali, kako bi to lahko bilo zamenjal za nek substitut (temu bi potem ocena uporabno v smislu priporočanja. ‘zaželenosti’ narasla). Ta problem je sicer zelo občutljiv na število nakupov. 12 5. 
DODATEK

TABELA 1: učinki akcij

     šifra    izdelek                Povprečna sprememba    Povprečna sprememba
                                     prometa med akcijo     prometa po akciji
 1   149725   Toaletni papir PALOM         4458.97                9004.5
 2   146861   Mleko trajno alpsko,          266.166               1621.48
 3   147757   Napitek izotonicni S          591.6133               857.54
 4   159161   Jogurt navadni, cvrs         2872.275                786.33
 5   150673   Kuhinjske brisace PA          535.74                 781.155
 6   146485   Voda RADENSKA classi          821.8833               701.22
 7   159133   Voda, namiz                   199.06                 687.22
 8   164210   Mleko trajno Zelene           545.92                 522.09
 9   149226   Cvetaca                        43.5433               517.35
10   159129   Voda gazira                   -10                    421.43
11   149349   Kajzerica 55g                 287.75                 305.58
12   149998   Banane                        191.528                246.56
13   150231   Kruh rzeni                    276.896                198.98
14   147837   Jajca l,, 1                   330.43                 195.88
15   159057   Mleko trajno Zelene           339.415                134.4
16   147266   Mleko trajno lejko,           392.68                 125.09
17   151099   Sok lumpi, jabolko,           424.26                  79.36
18   146403   Pivo UNION, svetlo,             4.89                  59.71
19   159143   Pivo, svetl                   182.815                 54.505
20   147274   Cokolada GORENJKA, t           52.25                  47.84
21   151877   Keksi domacica origi           94.97                  36.27
22   148445   Pivo LAsKO CLUB, piv          221.245                 26.945
23   151988   Cokolada PR                    23.94                  16.87
24   149318   Sosedovo pecivo, s s          323.63                   4.51
25   151985   Cokolada PR                    17.86                   4.29
26   159151   Radler gren                  1821.7                    0.78
27   156492   Cokolada GORENJKA ml          354.97                 -21.12
28   151065   Cokolada BALI z rize          -40.19                 -22.7
29   164990   Cokolada MILKA noise           70.735                -50.39

TABELA 2: cikličnost potrošnje

GRAF 1: nakupi uporabnika

ANALIZA MOŽNOSTI ZAZNAVANJA PODOBNOSTI MED UPORABNIKI

Božidara Cvetković, Mitja Luštrek
Department of Intelligent Systems, Jožef Stefan Institute
Jamova cesta 39, 1000 Ljubljana, Slovenia
e-mail: {boza.cvetkovic, mitja.lustrek}@ijs.si

POVZETEK

Prispevek predstavlja preliminarne rezultate analize možnosti zaznavanja podobnosti med uporabniki. Cilj analize je izbrati najboljši pristop, ki bo uporabljen v metodi za prilagajanje modela uporabniku MCAT.

1 UVOD IN SORODNO DELO

V aplikacijah, kjer se uporabljajo modeli strojnega učenja za napovedovanje človeškega obnašanja, se pogosto dogaja, da točnost delovanja v realnem okolju ni primerljiva točnosti delovanja v laboratorijskem okolju. Razlog je tako omejena količina učnih podatkov, kot tudi fizična razlika ter razlika v navadah med ljudmi. Fizične razlike se kažejo bodisi v drugačnosti izvajanja akcij v primeru problema prepoznavanja aktivnosti ali v drugačnem metabolnem sistemu v primeru problema ocene porabe energije.

ali neomejena količina neoznačenih podatkov. Najbolj osnovna metoda je samo-učenje (self-training [1]), ki uporablja en klasifikator za označevanje podatkov in ročno nastavljen prag za odločitev o izbiri podatka za dodajanje v učno množico. Prag je po navadi nastavljen tako, da mora biti zaupanje v napoved 100%. Nadgradnja metode z enim klasifikatorjem je dodajanje več klasifikatorjev, ki so naučeni z različnimi algoritmi in za dodajanje uporabljajo večinski glas (Democratic co-learning [2]), ali več klasifikatorjev z istim učnim algoritmom in več dimenzijami (Co-training [3]). Pomanjkljivost prvega je v ročno nastavljenem pragu (100% zaupanje v napoved), problem drugega pa kompleksnost delitve prostora na dva ortogonalna dela ali dimenziji. Več o metodah pol-nadzorovanega učenja pišemo v našem preteklem delu, kjer smo prilagajali klasifikator za prepoznavanje aktivnosti novemu uporabniku.
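Zgoraj omenjeni postopek samo-učenja (self-training [1]) lahko ponazorimo s spodnjo splošno skico v Pythonu. Vmesnik klasifikatorja v slogu scikit-learn (metodi fit in predict_proba) ter način obravnave praga sta naši predpostavki in ne implementacija metode MCAT ali navedenih virov.

# Generic sketch of the self-training idea cited above ([1]): a single
# classifier labels unlabelled instances and only those predicted with
# (near) full confidence are added to the training set, after which the
# model is retrained. The scikit-learn-style interface (fit/predict_proba)
# and the threshold handling are assumptions, not the MCAT implementation.
import numpy as np

def self_training(classifier, X_labelled, y_labelled, X_unlabelled,
                  confidence_threshold=1.0, max_rounds=10):
    X, y = np.array(X_labelled), np.array(y_labelled)
    pool = np.array(X_unlabelled)
    for _ in range(max_rounds):
        if len(pool) == 0:
            break
        classifier.fit(X, y)
        probabilities = classifier.predict_proba(pool)
        confidence = probabilities.max(axis=1)
        selected = confidence >= confidence_threshold   # usually 100 % confidence
        if not selected.any():
            break
        pseudo_labels = probabilities[selected].argmax(axis=1)
        # move the confidently labelled instances into the training set
        X = np.vstack([X, pool[selected]])
        y = np.concatenate([y, classifier.classes_[pseudo_labels]])
        pool = pool[~selected]
    classifier.fit(X, y)
    return classifier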
Pokazali smo, da lahko z mehanizmom Točnost modela za določenega uporabnika lahko zvišamo na za prilagajanje novemu uporabniku (MCAT - Multi- dva načina: Classifier Adaptive Training [4]) in omejeno količino na • označimo dodatne učne podatke specifične za novo označenih podatkov (3 aktivnosti po 30 sekund) novega uporabnika in uporabimo nadzorovano zvišamo prepoznavanje aktivnosti za približno 12 odstotnih učenje za nov model ali točk. Ogrodje MCAT metode je okvirno predstavljena na • uporabimo katero od metod, ki nenadzorovano sliki 1. ali pol-nadzorovano prilagodijo model trenutnemu uporabniku. Neoznačena instanca Najboljše izboljšanje dobimo z označevanje dodatnih Učna množica Učna množica osnovnega specifičnega podatkov. Vendar je ta proces časovno zelo zahteven, modela modela duhamoren in drag, tako za označevalca kot za uporabnika. Velikokrat se zgodi, da je samo označevanje podatkov v Osnovni model Specifični model ciljnem okolju onemogočeno, bodisi zaradi samega Posodobi klasifikacijskega problema (označevanje padcev je lahko nevarno) ali pa zato, ker nam manjkajo dodatne naprave, ki Ponovno učenje Selekcija niso mobilne in jih lahko uporabljamo izključno v modela laboratoriju (poraba človeške energije iz izdihanega zraka). V tem primeru se izkažejo rešitve, ki uporabljajo pol- Dodajanje nadzorovano učenje, bolj primerne. Metode pol- Označena instanca nadzorovanega učenja označijo neoznačene podatke in glede na določeno pravilo izberejo ali zavržejo trenutni podatek za dodajanje v učno množico. Nad učno množico, ki vsebuje Slika 1: Ogrodje metode MCAT nove podatke, se nato uporabi nadzorovan algoritem za strojno učene za pridobitev novega, prilagojenega modela. Ogrodje MCAT pričakuje naslednje klasifikatorje: Metode pol-nadzorovanega učenja lahko kategoriziramo na • osnovni model: model, ki se je v laboratorijskem okolju več načinov. Glede na število klasifikatorjev, glede na izkazal za najboljšega, število dimenzij (ortogonalnost atributnih vektorjev), glede • specifični model: model ali množica modelov, ki na način prilagajanja in glede na to ali se uporablja omejena vsebujejo znanje o specifikah trenutnega uporabnika, 14 • selekcija: model, ki izbere končno oznako, katere podobnost smo ugotavljali. Cilj je pridobiti množico • dodajanje: model, ki se odloča, ali je trenutna instanca oseb, ki so najbolj podobni novemu uporabniku in iz njihovih dovolj kvalitetna za dodajanje v učno množico modelov oceniti porabo energije novega uporabnika. osnovnega modela. Točnost ocene mora biti višja od splošnega modela, ki je Cilje trenutne raziskave je uporabiti isto ogrodje na naše izhodišče. regresijski domeni, bolj specifično za oceno porabe človeške energije. Označevanje podatkov za novega uporabnika je v 3.1 Razvrščanje v skupine ali gručenje tem primeru onemogočeno, saj bi uporabnik moral v Za razvrščanje v skupine smo uporabili algoritem k-means iz laboratorij, kjer se nahajajo potrebne naprave (Cosmed orodja za strojno učenje Weka [6]. Za idealno število gruč k4b2). smo uporabili koeficient Silhouette, ki poda mero, kako Ta prispevek predstavlja analizo pristopov za možnost dobro podatek ustreza trenutni gruči. Koeficient je definiran detekcije podobnosti med uporabniki. Privzeli bi, da pristop z naslednjo enačbo. z najboljšim delovanjem opiše trenutnega uporabnika ܾሺ݅ሻ − ܽሺ݅ሻ zadosti dobro da ga lahko uporabimo kot specifični model v ݏሺ݅ሻ = max {ܽሺ݅ሻ,ܾሺ݅ሻ} MCAT algoritmu. 
−1 ≤ ݏሺ݅ሻ ≤ 1 2 NABOR PODATKOV V raziskavi smo uporabili dva nabora podatkov in sicer Za izračun koeficienta uporabnika i uporabimo podatke, ki so uporabljeni kot učna množica splošnega • a(i) - povprečna razdalja vseh uporabnikov v modela in pa nabor podatkov, ki predstavlja bio-impedanco gruči oseb vsebovanih v učni množici splošnega modela. • b(i) - najmanjša razdalja trenutnega uporabnika do sosednje gruče Učna množica splošnega modela je bila zbrana v Ustreznost gruče je definirana z velikostjo koeficienta. kontroliranem laboratorijskem okolju Fakultete za Šport in Najbolj ustrezna delitev je pri s(i) = 1, če je koeficient blizu vsebuje podatke 10 ljudi, ki so izvajali vnaprej določene 0 je na robu dveh gruč in če je -1 verjetno bolj ustreza drugi sklope aktivnosti. Opremljeni so bili s pospeškomeri na gruči. Izračunan koeficient za tri osebe lahko vidimo na Sliki prsih in stegnu, prsnim pasom za merjenje srčnega utripa, 2. Za osebi A in B je najboljša delitev na dve gruči in za napravo Senswear, ki meri oddajanje toplote človeka, osebo H na 5 gruč. galvanski odziv kože in telesno temperaturo ter oceni človekovo porabo energije in indirektnim kalorimetrom 0.80 Cosmed k4b2, ki meri porabo energije na osnovi izdihanega 0.70 0.63 0.62 0.62 ogljikovega dioksida in porabe kisika. Ta nabor podatkov je 0.75 0.58 0.49 0.55 0.60 ette 0.52 u 0.46 bil uporabljen za gradnjo in vrednotenje ve o č regresijskih 0.50 0.40 0.43 0.40 ilh modelov za oceno porabe energije. Izbran je bil najboljši, ki 0.40 t S 0.46 0.46 n 0.47 0.46 vsebuje podatke pospeškomerov, blizu telesne temperature 0.30 0.07 eficie 0.07 in sr 0.20 čnega utripa. Ta model je privzet za splošni model, o 0.33 0.03 0.04 0.31 K 0.31 0.10 0.25 deluje s povprečno absolutno napako (MAE) 0.55 MET 0.00 (Metabolic Equivalent of Task). 1 2 3 4 5 6 7 8 9 10 Število gruč Učna množica bio-impedance so podatki, pridobljeni iz Oseba A Oseba B Oseba H naprave InBody [1], ki analizira sestavo telesa. Podatki Slika 2: Silhouette koeficient ustreznosti delitve. vsebujejo: višino, težo, starost, količino vode v celicah, izven celic, količino proteinov, mineralov, maščobe, maso skeleta, S to metodo smo dobili gruče podobnih oseb. indeks telesne teže, razmerje med pasom in boki in podatke o teži udov. Vsebuje tudi maksimalne in minimalne 3.2 Meta klasifikacija vrednosti za vsak tip podatkov, kar smo uporabili na normalizacijo in dodali se maksimalen in minimalen srčni Za uteževanje ocen smo poizkusili še meta-klasifikator za utrip uporabnika. Ta je bil umerjen med 15 minutnim vsako osebo posebej. Za meta-klasifikator smo uporabili ležanjem (minimalen srčni utrip) in po dveh minutah podatke osmih oseb pri ocenjevanju devete. Za končne intenzivnega teka (maksimalen srčni utrip). To učno evaluacijo smo uporabili deseto osebo. množico smo uporabili za ugotavljanje podobnosti med Začetno množico atributov meta klasifikatorja sestavljajo uporabniki. naslednji atributi: • evklidske razdalje od trenutne osebe do vseh 3 PRISTOP ZA UGOTAVLJANJE PODOBNOSTI oseb v gruči, MED UPORABNIKI • trenutna razpoznana aktivnost osebe, Podobnost med uporabniki smo analizirali z uporabo nabora • nivo aktivnosti (nizka, srednja, visoko), podatkov bio-impedance in testirali na podatkih osebe, • normaliziran srčni utrip osebe, 15 Tabela 1: Rezultati glede na pristop ugotavljanja podobnosti med uporabniki. Pristopi so opisani v sekciji 3.3. 
Pristopi Splošni model (MAE) Število gruč Število oseb v gruči A B C D E F Oseba A 0.49 2 8 0.53 0.49 0.49 0.49 0.48 0.48 Oseba B 0.69 2 3 0.77 0.69 0.70 0.69 0.73 0.69 Oseba C 0.64 3 4 0.75 0.60 0.61 0.60 0.58 0.59 Oseba D 0.55 4 1 0.93 0.54 0.54 0.54 0.48 0.49 Oseba E 0.44 2 8 0.40 0.47 0.48 0.47 0.44 0.44 Oseba F 0.55 2 8 0.68 0.60 0.60 0.60 0.55 0.55 Oseba G 0.57 2 8 0.50 0.61 0.62 0.61 0.56 0.56 Oseba H 0.46 5 2 0.42 0.51 0.51 0.51 0.46 0.46 Oseba I 0.64 2 8 0.67 0.63 0.63 0.63 0.72 0.63 Oseba J 0.50 6 1 0.65 0.47 0.47 0.47 0.53 0.50 Povprečno 0.55 0.63 0.56 0.56 0.56 0.55 0.54 • povprečna absolutna napaka ocene modela Pristop B: Vsako instanco ocenijo modeli oseb, ki so v gruči osebe glede na oceno splošnega modela, ,in končna ocena je povprečje ocen. • cona srčnega utripa po metodi Zoladz [8], • procent povprečne absolutne napake ocene Pristop C: Vsako instanco ocenijo modeli oseb, ki so v gruči, modela glede na oceno splošnega modela. in končna ocena je utežena vsota glede na evklidsko razdaljo do centroide v gruči. Delovanje meta-klasifikatorja je naslednje. Vsako instanco se oceni z modeli oseb, ki so v gruči, in vsaka ocena je Pristop D: Vsako instanco ocenijo modeli oseb, ki so v gruči, ovrednotena s svojim meta-klasifikatorjem, ki vrne enega od in končna ocena je utežena vsota glede na evklidsko razdaljo dveh razredov: »da« ali »ne«. Da pomeni, da se ocena do nove osebe v gruči. Če je v gruči ena oseba je rezultat uporabi, in ne, da se zavrže. Poleg vsake klasifikacije utežena vsota splošnega modela in modela osebe. klasifikator vrne stopnjo zaupanja v svojo napoved. Končna ocena se izračuna glede na število modelov, katerih rezultat Pristop E: Za oceno so uporabljeni meta klasifikatorji in je bil »da«: modeli vseh oseb. - število »da« > 1; normalizira se stopnja zaupanja za vsak model, ki je klasificiral »da«. Normalizirane Pristop F: Za oceno so uporabljeni meta klasifikatorji in stopnje se uporabijo kot utež trenutne ocene in modeli oseb v gruči. utežena vsota vseh tvori končno oceno, - število »da« = 1; stopnja zaupanja je uporabljena 4 REZULTATI kot utež ocene tega modela. Ostanek je uporabljen kot utež ocene splošnega modela. Utežena vsota Rezultati predstavljajo evaluacijo vseh omenjenih pristopov. obeh tvori kon Cilj je izbrati pristop, ki vrača manjšo ali primerljivo točnost čno oceno, - število »da« = 0; uporabi se ocena splošnega splošnemu modelu. Rezultati so predstavljeni v Tabeli 1 in modela sicer z povprečno absolutno napako (MAE) definirano z naslednjo enačbo: ௡ Uporabnost atributov smo ovrednotili s kombiniranjem vseh 1 ܯܣܧ = in izločili tiste atribute, ki ne pripomorejo k boljši točnosti ݊ ෍หܧܧ௢௖௘௡௝௘௡௔ − ܧܧ௣௥௔௩௔ห izbire in hkrati točnosti ocene. Atributi, ki so ostali v ௜ୀଵ Končna ocena najboljšega pristopa je ocenjena z povprečno končnem vektorju atributov, so: absolutno procentualno napako definirano z naslednjo • evklidske razdalje od trenutne osebe do vseh enačbo (MAPE): oseb v gruči, 100% ௡ ܧܧ • trenutna razpoznana aktivnost osebe, ܯܣܲܧ = ௢௖௘௡௝௘௡௔ − ܧܧ௣௥௔௩௔ ݊ ෍ ቤ ܧܧ ቤ • nivo aktivnosti (nizka, srednja, visko), ௣௥௔௩௔ ௜ୀଵ • cona srčnega utripa po Zoladz metodi [8]. V obeh enačbah EEocenjena predstavlja oceno porabe energije, 3.3 Pristopi kot jo vrne regresijski model in EEprava je izmerjena poraba Pristop A: Vsako instanco oceni devet modelov (posamezni energije. model osebe) in kon Točnost splošnega modela je predstavljena v drugem stolpcu čna ocena je povprečje ocen. Tabele 1. Povprečna napaka modela je 0.55 MET in MAPE modela je 25%. 
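The following sketch restates the two error measures used in the evaluation and adds a distance-weighted combination of per-person estimates in the spirit of approaches C and D. The per-person estimates and Euclidean distances are assumed to be available; function names are illustrative.

```python
# MAE, MAPE and a distance-weighted combination of per-person estimates.
import numpy as np

def mae(ee_estimated, ee_true):
    """Mean absolute error in MET."""
    return float(np.mean(np.abs(np.asarray(ee_estimated) - np.asarray(ee_true))))

def mape(ee_estimated, ee_true):
    """Mean absolute percentage error, in percent."""
    est, true = np.asarray(ee_estimated, float), np.asarray(ee_true, float)
    return float(100.0 * np.mean(np.abs((est - true) / true)))

def weighted_estimate(estimates, distances):
    """Weight each person's estimate by inverse Euclidean distance, so closer
    (more similar) persons contribute more to the final energy estimate."""
    weights = 1.0 / (np.asarray(distances, float) + 1e-9)
    weights /= weights.sum()
    return float(np.dot(weights, estimates))
```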
Prvi pristop (pristop A) uporabi povprečno 16 oceno vseh oseb. Iz rezultata lahko vidimo, da se napaka [8] Zoladz, http://en.wikipedia.org/wiki/Heart_rate poveča in da ta pristop ni pravilen, kar je tudi v skladu s hipotezo, da uporabljen model mora biti podoben modelu končne osebe. Pristop B uporabi dodatno znanje o medsebojni podobnosti oseb in za končno oceno uporabi povprečje ocen podobnih oseb (osebe v isti gruči). Rezultat je slabši od splošnega modela, tako v obliki MAPE 26% kot tudi MAE 0.56 MET. Pristop C uporabi za utež napovedi evklidsko razdaljo osebe do centroide. Končna točnost je slabša od splošnega modela in sicer 0.56 MET in 26% v obliki MAPE. Pristop D vrne primerljive rezultate kot pristopa B in C. Pristop E uporabi meta-klasifikator, vendar na vseh osebah. Iz rezultata lahko vidimo, da z vpeljavo meta klasifikatorja dosežemo primerljivo točnost, kot ga dobimo s splošnim modelom. Če uporabimo meta klasifikatorje samo na osebah ki so v gruči, pa pridobimo na točnosti in sicer 0.01 MET v obliki MAE in 3 odstotne točke v obliki MAPE. 5 ZAKLJUČEK Ta prispevek predstavlja preliminarne rezultate analize pristopov za ugotavljanje podobnosti med uporabniki. Analiza je bila narejena na domeni ocene porabe človeške energije z namenom definirati specifični model za pol- nadzorovano metodo MCAT, katero bomo v prihodnjem delu nadgrajevali. Pristop, ki vrača najboljšo točnost, uporablja algoritem gručenja za delitev oseb v skupine po podobnosti in meta klasifikatorje posameznih oseb v gruči za končno oceno porabe energije osebe. Z uporabo pristopa za podobnost izboljšamo rezultat najboljšega modela za 3 odstotne točke. Prihodnje delo zajema razširitev pristopov in uporabo najboljšega pristopa v metodi MCAT. References [1] Frinken, V., Bunke, H.: Self-training Strategies for Handwriting Word Recognition. In: Perner P. (eds.) Advances in Data Mining. Applications and Theoretical Aspects. LNCS, vol. 5633, pp. 291--300, 2009. [2] Zhou, Y., Goldman, S.:Democratic Co-Learning. In: 16th IEEE International Conference on Tools with Artificial Intelligence, pp. 594--602, IEEE press, 2004 [3] Blum, A. and Mitchell, T.: Combining labeled and unlabeled data with co-training. In: 11th annual conference on Computational learning theory, pp. 92-- 100, 1998. [4] B. Cvetković, B. Kaluža, M. Luštrek, M. Gams, “Adapting Activity Recognition to a Person with Multi- Classifier Adaptive Training,” Journal of Ambient Intelligence and Smart Environments. Accepted for publication, 2014. [5] InBody, http://www.e-inbody.com/ [6] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten. The WEKA Data Mining Software: An Update. SIGKDD Explorations. 11(1), pp. 10–18, 2009. [7] Silhouette koeficient, http://en.wikipedia.org/wiki/Silhouette_(clustering) 17 VISUALIZATION OF EXPLANATIONS OF INCREMENTAL MODELS Jaka Demšar, Zoran Bosnić , Igor Kononenko University of Ljubljana, Faculty of Computer and Information Science Večna pot 113, SI-1000 Ljubljana, Slovenia e-mail: jaka.demsar0@gmail.com, {zoran.bosnic, igor.kononenko}@fri.uni-lj.si ABSTRACT Another method, Page-Hinkley test [10] was devised to The temporal dimension that is ever more prevalent in detect the change of a Gaussian signal and is commonly data makes the data stream mining ( incremental used in signal processing. learning) an important field of machine learning. In Bare prediction quality is not a sufficient property of a good addition to accurate predictions, explanations of models machine learning algorithm. 
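The Page-Hinkley test mentioned above can be sketched as follows. This is a generic formulation of the test; the tolerance `delta` and alarm threshold `lam` are illustrative parameter values, not values used in the paper.

```python
# Compact Page-Hinkley change detector: accumulates deviations of the monitored
# signal from its running mean and signals a change when the accumulated value
# rises far enough above its historical minimum.
class PageHinkley:
    def __init__(self, delta=0.005, lam=50.0):
        self.delta = delta      # allowed drift before deviations start accumulating
        self.lam = lam          # alarm threshold
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0          # cumulative deviation m_T
        self.cum_min = 0.0      # minimum of m_T observed so far

    def update(self, x):
        """Feed one observation; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return (self.cum - self.cum_min) > self.lam
```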
Explanation (a form of data and examples are a crucial component as they provide postprocessing) of individual predictions and model as a insight into model's decision and lessen its black box whole is needed to increase the user's trust in the decision nature, thus increasing the user's trust. Proper visual and provide insight in the workings of the model, which representation of data is also very relevant to user's increases the models credibility. IME ( Interactions-based understanding -- visualization is often utilised in Method for Explanation) [13] with its efficient adaptation machine learning since it shifts the balance between [12] is a model independent method of explanation, which perception and cognition to take fuller advantage of the also addresses interactions of features and therefore brain's abilities. In this paper we review visualisation in successfully tackles the problem of redundant and incremental setting and devise an improved version of disjunctive concepts in data. The explanation of the an existing visualisation of explanations of incremental prediction for each instance is defined as a vector of models. We discuss the detection of concept drift in data contributions of individual feature values. Positive streams and experiment with a novel detection method contibution implies that the particular feature value that uses the stream of model's explanations to positively influenced the prediction (and vice versa) while determine the places of change in the data domain. the absolute value of a contributon is proportional to the magnitude of influence on the decision, i.e. the importance 1 INTRODUCTION of that feature value. The sum of all contributions is equal to the difference between the prediction using all feature values Data streams are becoming ubiquitous. This is a and a prediction using no features (prediction difference). consequence of the increasing number of automatic data The explanation of a single prediction can be expanded to feeds, sensoric networks and internet of things [1]. The the whole model [12] and also to incremental setting [3]. In defining characteristics of data streams are their transient the latter case, drift detection (SPC) and adaptation are used dynamic nature and temporal component. In contrast with to compensate for concept drift. Explanation of a data stream static datasets (used in batch learning), data streams (used in is therefore itself a data stream. incremental learning) are large, changing, semi-structured Related to exaplantion is data visualisation - a versatile tool and possibly unlimited. This poses a challenge for storage in machine learning that serves two purposes; sense-making and processing as the data can be only read once. For (data analysis) and communication as it conveys abstract incremental learning models, operations of model increment concepts in a form, understandable to humans (it shifts the and decrement are vital. Concepts and patterns in data balance between perception and cognition to take fuller domain can change ( concept drift) - we need to adapt to this advantage of the brain's abilities [4]). The majority of phenomenon or the quality of our predictions deteriorates. published visualizations depict data that has a temporal According to PAC ( Probably approximately correct) component [8]. In this context, visualization acts as a form learning model, if the distribution, generating the instances of summarization, since the datasets can be extremely large. 
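A sampling approximation in the spirit of the IME method can be sketched as below; this is an illustrative approximation, not the authors' implementation. The contribution of feature i is estimated as the average change in the model's output when feature i is switched from the explained instance to a randomly drawn instance, averaged over random feature orderings. `model` is assumed to be any callable returning a scalar prediction for a single NumPy instance.

```python
# Sampling-based estimate of one feature value's contribution to a prediction.
import numpy as np

def contribution(model, X_background, x, i, n_samples=500, seed=0):
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        z = X_background[rng.integers(len(X_background))]   # random reference instance
        order = rng.permutation(len(x))
        pos = int(np.where(order == i)[0][0])
        tail = order[pos + 1:]                               # features "after" i in this ordering
        with_i, without_i = x.copy(), x.copy()
        with_i[tail] = z[tail]                               # i kept from x, tail taken from z
        without_i[tail] = z[tail]
        without_i[i] = z[i]                                  # i also taken from z
        total += model(with_i) - model(without_i)
    return total / n_samples
```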
is stationary, the error rate for sound machine learning The challenge lies in representing the temporal component algorithms will decline towards the Bayes error rate as the (including concept drift), especially if we are limited to two- number of processed instances increases [9]. Consequently, dimensional non-interactive visualisations. when a statistically significant rise in error rate is detected, The main goal of this paper is improving the existing we can suggest that there has been a change in the methodology for visualising explanations of incremental generating distribution - concept drift. models [3]. The feature value contributions are represented The basis of statistical process control ( SPC) [5] is with customised bar charts. Multiple such charts are detecting statistically significant error rate (using the central required to explain the model at diffrerent points in time. limit theorem) by monitoring the mean and standard They become very difficult to read as a whole because of deviation of a sequence of correct classification indicators. the large number of visual elements that we have to compare (we sacrifice macro view completely in favour of 18 micro view). To consolidate these images and address the out (micro view), while general patterns and trends can be change blindness phenomenon, charts are stacked into a recognised in the shapes of lines that are intuitive single plot, where the age and size of the exaplanation are representations of flowing time (macro view). The resulting represented with transparency (older and "smaller" visualisations are dense with information, easily explanations fade out). The resulting visualisation is not understandable (conventional plotting of independent tainted by first impressions (as it is only one image) and is variable, time, on x axis) and presented in gray-scale palette, adequately dense and graphically rich. However, the major making them more suitable for print. flaw of this approach lies in the situations when columns, representing newer explanations override older ones and 3 DETECTING CONCEPT DRIFT USING THE thus obfuscate the true flow of changing explanations, for STREAM OF EXPLANATIONS example, when the concept drift precipitates the attribute When explaining incremental models, the resulting value contributions to increase in size without changing the explanations are, in themselves, a data stream. This gives us sign. Concepts can therefore become not only hidden; the option to process it with all the methods used in what's more, the visualization can be deceiving, which we incremental learning. In our case, we'll devise a method to consider to be worse than just being too sparse. Therefore, detect outliers in the stream of explanations and declare such we need to clarify the presentation of the concept drift along points as places of concept drift. The reasoning behind this is with an accurate depiction of each explanation's the notion that if the model does not change, then also the contributions while maintaining the macro visual value, that explanation of the whole model will not change. When an enables us to detect patterns and get a sense of true concepts outlier is detected, we consider this to be an indicator of a and flow of changes behind the model. significant change in model and thus also in the underlying An additional goal was to devise a method of concept drift data. 
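For reference, an SPC-style error-rate monitor of the kind referred to above might look as follows. This follows one common formulation (warning at two, drift at three standard deviations above the best error seen so far); the thresholds and the warm-up length are assumptions, not taken from the paper.

```python
# Error-rate drift monitor based on the mean and standard deviation of the
# stream of correct/incorrect classification indicators.
import math

class SPCDetector:
    def __init__(self):
        self.n = 0
        self.errors = 0
        self.p_min, self.s_min = float("inf"), float("inf")

    def update(self, correct):
        """Feed one boolean indicator; return 'in-control', 'warning' or 'drift'."""
        self.n += 1
        self.errors += 0 if correct else 1
        p = self.errors / self.n
        s = math.sqrt(p * (1 - p) / self.n)
        if self.n < 30:
            return "in-control"                   # short warm-up avoids spurious alarms
        if p + s < self.p_min + self.s_min:       # remember the best (lowest-error) state
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + 3 * self.s_min:
            return "drift"
        if p + s > self.p_min + 2 * self.s_min:
            return "warning"
        return "in-control"
```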
In addition to this, the method provides us with a detection which monitors the stream of explanations and stream of explanations that is continuous to a certain degree detects anomalies in it; the detected anomalies are of granularity and so enables us to overview the concepts interpreted as a concept drift. We test the improved behind the data at more frequent intervals than the existing visualization and the novel concept drift detection method explanation methodology. on two datasets and evaluate the results. We use a standard incremental learning algorithm [5] (learn by incrementally updating the model, decrement the models 2 VISUALISATION FOR INCREMENTAL MODELS if it becomes too big according to the parameter, rebuild the When visualising explanations of individual predictions, model if we detect change [6]) and introduce some horizontal bar charts are a fitting method also in the additional parameters. Granularity determines how often the incremental setting. Individual examples are always explanation of the current model will be triggered. The explained according to the current model which, in our case, generated stream of explanations (vectors of feature value can change. This is not an obstacle, since the snapshot of the contributions) will be compared using cosine distance. For model is in fact the model that classified the example. each new explanation, the average cosine distance from all This approach fails with explanations of incremental models other explanations that are in the current model, is as we need a new figure for each local explanation. To calculated. These values are monitored using the Page successfully represent the temporality of incremental models, Hinkley test. When the current average cosine distance from we use two variations of a line plot where the x axis contains other explanations has risen significantly, we interpret that as time stamps of examples and the splines plotted are various a change in data domain - concept drift. The last examples representations of contributions ( y axis). are then used to rebuild the model, the Page Hinkley statistic The first type of visualization (Figures 2 and 3) has one line and the local explanation storage are reset (to monitor the plot for each attribute. Contributions of values of the new model). individual attribute are represented with line styles. The The cosine distance is chosen because, in the case of mean positive and mean negative contribution of the explanations, we consider the direction of the vector of attribute as a whole are represented with two thick faded contributions to be more important than its size, which is lines. Solid vertical lines indicate the spots where very influential in the traditional Minkowski distances. The explanation of the model was triggered (and therefore page Hinkley test is used in favour of SPC because of its become the joints for the plotted splines), while dashed superior drift detection times [9] and the lack of need for a vertical lines mark the places where the actual concept drift buffer - examples are already buffered according to the occurs in data. The second type is an aggregated version granularity. The method is therefore model independent. (Figure 3) where the mean positive and mean negative contributions of all attributes are visualized in one figure. 
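A sketch of the proposed detector over the stream of explanations is given below: each new explanation vector is compared by average cosine distance to the explanations collected for the current model, and that average is fed to a change detector such as the Page-Hinkley sketch shown earlier. Class and method names are illustrative.

```python
# Drift monitor over the stream of explanation vectors.
import numpy as np

def cosine_distance(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - float(np.dot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

class ExplanationDriftMonitor:
    def __init__(self, detector):
        self.detector = detector      # any object with update(value) -> bool and reset()
        self.explanations = []        # explanations belonging to the current model

    def add(self, explanation):
        """Return True when the new explanation signals a concept drift."""
        explanation = np.asarray(explanation, float)
        if self.explanations:
            avg = float(np.mean([cosine_distance(explanation, e) for e in self.explanations]))
            if self.detector.update(avg):
                # drift detected: the caller rebuilds the model; reset local state
                self.explanations = [explanation]
                self.detector.reset()
                return True
        self.explanations.append(explanation)
        return False
```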
In 4 RESULTS these two ways we condense the visualization of incremental 4.1 Testing methodology and datasets models without a significant loss in information while still providing a quality insight into the model. Exact values of We test the novel visualisation method and the concept drift contributions along with timestamps of changes can be read detection method on two synthetic datasets, both containing multiple concepts with various degrees of drift between 19 them. These datasets are also used in previous work [3], so a direct assessment of visualization quality and drift detection performance can be made. The naive Bayes classifier and the nearest neighbour classifier are used. Their usage yields very similar results in all tests, so only results obtained by testing with Naive Bayes are presented. SEA concepts [11] is a data stream comprising 60000 instances with continuous numeric features xi ∈ [0,10], where i ∈ {1,2,3}. x1 and x2 are relevant features that determine the target concept with x1+x2 ≤ β where threshold β ∈ {7,8,9,9.5}. Concepts change sequentially every 15000 examples. Although the changes between the generated Figure 2: P eriodically triggered explanations (SEA). concepts are abrupt, class noise is inserted into each block. The instances of second dataset, STAGGER [2], represent geometrical shapes which are in the feature space described by size, color and shape. The binary class variable is determined by one of the three target concepts ( small ˄ green, green ˅ square, medium ˅ large ). 4500 instances are divided into four blocks (concept-wise) with examples mixing near the change points according to a sigmoid function, so the dataset includes gradual concept drift. 4.2. Improved visualizations Concept drifts in STAGGER dataset are correctly detected and adapted to as reflected in Figure 3. The defined concepts can be easily recognized from explanations triggered by the SPC algorithm - the change in explanation follows the change in concept. Windows generated by the vertical lines give us insight in local explanations of the model (where the concept is deemed to be constant). Disjunct concepts (2 and 3) and redundant feature values are all explained correctly (e.g. reduncacy of shape and disjunction of size values in concept 3). Figure 1 demonstrates how classifications of two Figure 3: E xplanations triggered at change detection instances with same feature values can be explained (STAGGER). completely differently at different times - adapting to change is crucial in incremental setting. This is also evident in the explanations on the STAGGER dataset yielded positive aggregated visualization, which can be used to quickly results. As depicted in Figure 4, the method correctly detects determine the importance of each attribute. concept drifts without false alarms and is in that regard For SEA dataset, explanations of instances are tightly similar to SPC method. The stream of explanations was corresponding to explanations of the model. As evident in similar to those obtained with other successful drift detection Figure 2, the shape of contributions of features reflects the methods. Choices of larger granulations yielded similar target concept; lower values increase the likelihood of results, but the change detection was obviously delayed. The positive classification and vice versa. 
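A generator for a SEA-like stream, following the description above, might look as follows; the 10 % class-noise level is an illustrative choice rather than the exact value used in the experiments.

```python
# SEA-concepts stream: three uniform features on [0, 10], class x1 + x2 <= beta,
# beta changing every 15000 examples, class noise flipped into each block.
import numpy as np

def generate_sea(n=60000, betas=(7.0, 8.0, 9.0, 9.5), block=15000, noise=0.1, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 10.0, size=(n, 3))            # x3 is irrelevant by construction
    beta = np.repeat(betas, block)[:n]                  # abrupt concept change every block
    y = (X[:, 0] + X[:, 1] <= beta).astype(int)
    flip = rng.random(n) < noise                        # insert class noise
    y[flip] = 1 - y[flip]
    return X, y

X_stream, y_stream = generate_sea()
```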
Feature x concept drift was however never missed, provided that the 1 is correctly explained as irrelevant with its only contributions being the granulation was smaller that the spacing between sequential result of noise. changes in data. The delays of concept drift detection are correlated with the magnitude of change. For example, the 4.3. Concept drift detection last concept drift was detected with significant delay. In this regard, the proposed method is inferior to SPC algorithm - Evaluating the concept drift detection using the stream of the concept drift detection in noticeably delayed and we're also dependant on two parameters – granulation and alert threshold, so the generality of the method is diminished. When testing with SEA datasets, the concept drift was not correctly detected. Changing the granulation and Page Hinkley alert threshold parameter resulted in varying degrees Figure 1: Explanations of a single prediction at different of false alarms or non reaction to change (Figure 4). This times. behaviour can be attributed to a small magnitude of change that occurs in data - the difference between concepts in data 20 is quite small and continuous. However, when explaining possibilities for visualisation would emerge, particularly this (incorrectly adapted) model, we still recognise true those that rely on finely granular data, such as ThemeRiver underlying concepts. This can be attributed to automatically [7]. decrementing the model when it becomes too big. It is important to note that this does not perform well in general, if the prior knowledge is insufficient for us to correctly decide on the maximum model size. We conclude that, in this form, the presented method is not a viable alternative to the existing concept drift detection methods. Its downsides include high level of parametrization which requires a significant amount of prior knowledge and can also become improper if the model changes drastically. Consequently, another assessment of data is needed - the required manual supervision and lack of adaptability in this regard can be very costly and against the requirements of a good incremental model. The concept drift detection is also not satisfactory - it is delayed in the best case or concepts can be missed or falsely alerted in the worst case. Another downside is the time complexity - the higher the granularity Figure 4: Performance of various change detection methods. the more frequent explanations will be, which will provide Yellow line indicates true change in concepts, green line us with a good stream of explanations but be very costly indicates change detection and adaptaion). time-wise. The method is therefore not feasible in environments where quick incremental operations are vital. However, if we can afford such delays, we get a granular References stream of explanations which gives us insight into the model [1] C. C. Aggarwal, N. Ashish, A. P. Sheth. The internet of for roughly any given time. things: A survey from the data-centric perspective. In A note at the end: we should always remember that we are Managing and Mining Sensor Data. Springer, 2013. explaining the models and not the concepts behind the [2] A. Bifet, G. Holmes, R. Kirkby, B. Pfahringer, M. Braun. model. Only if the model performs well, we can claim that Moa: Massive online analysis. our explanations truly reflect the data domain [12]. This can [3] Jaka Demšar. 
Explanation of predictive models and individual be tricky in incremental learning, as at the time of a concept predictions in incremental learning (In Slovene). B.S. Thesis, drift, the quality of the model deteriorates. University of Ljubljana, 2012. [4] S. Few. Now You See It: Simple Visualization Techniques for Quantitative Analysis 5 CONCLUSION . Analytics Press, 1st edition, 2009. [5] J. Gama. Knowledge Discovery from Data Streams. Chapman The new visualization of explanation of incremental model is & Hall/CRC, 1st edition, 2010. indeed an improvement compared to the old one. The [6] D. Haussler. Overview of the probably approximately correct overriding nature of the old visualisation was replaced with (PAC) learning framework, 1995. an easy to understand timeline, while the general concepts [7] S. Havre, B. Hetzler, and L. Nowell. Themeriver: Visualizing (macro view) can still be read out from the shape of the theme changes over time. In Proc. IEEE Symposium on Information Visualization, 2000. lines. Micro view is also improved as we can determine [8] C. Ratanamahatana, J. Lin 0001, D. Gunopulos, E. J. Keogh, contributions of attribute values for any given time. M. Vlachos, and G. Das. Mining time series data. In The Data The detection of concept drift using the stream of Mining and Knowledge Discovery Handbook. Springer, 2005. explanations did not prove to be suitable for general use [9] R. Sebastião and J. Gama. A study on change detection based on the initial experiments. It has shown to be hindered methods. In Progress in Artificial Intelligence, 14th by delayed detection times, missed concept drift Portuguese Conference on Artificial Intelligence, EPIA 2009 occurrences, false alarms, high level of parametrization and [10] E. S. Page. Continuous Inspection Schemes. Biometrika, Vol. potential high time complexity. This provides motivation for 41:100-115, 1954. further experiments in this field, especially because the [11] W. N. Street and Y. S. Kim. A streaming ensemble algorithm for large-scale classification. In Proceedings of the 7th ACM stream of explanations provides good insight into the model SIGKDD international conference on Knowledge discovery with accordance to the chosen granulation. and data mining, KDD ’01, New York, NY, USA, 2001. The main goal of future research is finding a true adaptation [12] E. Štrumbelj and I. Kononenko. An efficient explanation of of IME explanation methodology to incremental setting, i.e. individual classifications using game theory. The Journal of efficient incremental updates of explanation at the arrival of Machine Learning Research, 11:1–18, 2010. each new example. Truly incremental explanation [13] E. Štrumbelj, I. Kononenko, and M. Robnik Šikonja. methodology would provide us with a stream of explanations Explaining instance classifications with interactions of subsets of finest granularity. In addition to this, a number of new of feature values. Data Knowl. Eng., 68(10):886–904, October 2009. 
21 DETECTION OF IRREGULARITIES ON AUTOMOTIVE SEMIPRODUCTS Erik Dovgan1, Klemen Gantar2, Valentin Koblar3,4, Bogdan Filipič1,4 1 Department of Intelligent Systems, Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia 2 Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia 3 Kolektor Group d.o.o., Vojkova ulica 10, SI-5280 Idrija, Slovenia 4 Jožef Stefan International Postgraduate School, Jamova cesta 39, SI-1000 Ljubljana, Slovenia erik.dovgan@ijs.si, kg6983@student.uni-lj.si, valentin.koblar@kolektor.com, bogdan.filipic@ijs.si ABSTRACT ous classification modes, including the detection of individual The use of applications for automated inspection of types of irregularities with binary classifiers and the classifi- semiproducts is increasing in various industries, including cation of all types of irregularities with a single classifier. the automotive industry. This paper presents the devel- The paper is further organized as follows. The problem of opment of an application for automated visual detection detecting irregularities on commutators is presented in Sec- of irregularities on commutators that are parts of vehi- tion 2. Section 3 describes the application for detecting ir- cle’s fuel pumps. Each type of irregularity is detected on regularities that was designed and implemented in a proto- a partition of the commutator image. The initial results type form for a specific production line. The experiments and show that such an automated inspection is able to reliably results from the development process are presented and dis- detect irregularities on commutators. In addition, the re- cussed in Section 4. Finally, Section 5 concludes the paper sults confirm that the set of attributes used to build the with some ideas for future work. classifiers for detecting individual types of irregularities 2 PROBLEM DESCRIPTION and the priority of these classifiers significantly influence the classification accuracy. Commutators are parts of electric motors that periodically re- verse the current direction between the rotor and the external 1 INTRODUCTION circuit. If the electric motor is installed in the vehicle’s fuel pump, it has to withstand the chemical stress, which is usu- Information technology (IT) is replacing human work in nu- ally not the case for other types of electric motors. There- merous domains. Such a technology is especially suitable fore, special graphite-copper commutators are produced for for repetitive non-creative procedures where high accuracy is this purpose. required. Automotive industry is introducing IT in various The production of graphite-copper commutators involves segments, for example in storage management and automated several stages. Among them the most critical one is soldering inspection of semiproducts. of graphite and copper parts of the commutator. The quality Automated inspection of semiproducts can be done by an- of the soldered joint is crucial for the quality of the commuta- alyzing data from several sources, such as sensors, lasers and tor since even the smallest joint irregularity is unacceptable. cameras. Utilization of cameras for this purpose has several During the soldering phase, four types of irregularities may advantages, e.g., it is fast, thus not slowing down the pro- occur: duction line, it is cheaper in comparison to highly specialized sensors, and the same hardware can be used for inspection of 1. 
metalization defect, i.e., there are visible defects on the heterogeneous semiproducts. metalization layer, This paper presents the development of an application for 2. excess of solder, i.e., more solder is applied than feasi- automated visual inspection of commutators as semiproducts ble, for automotive industry. This application processes images of commutators with computer vision algorithms to obtain 3. deficit of solder, i.e., less solder is applied than feasible, the attributes describing visual properties of the commutator. and These attributes are then used by machine learning algorithms 4. disoriented, i.e., the copper part is not appropriately ori- to classify the commutators, i.e., to determine whether or not ented with respect to the graphite part. they contain irregularities and, in the case they do, what is the type of irregularities. Experimental detection of irregular- The analysis of these irregularities showed that each type oc- ities was performed using various sets of attributes and vari- curs only on a specific part of the commutator. Consequently, 22 when visually inspecting the commutator, its image can be filter is used, where the size of the median window is an input partitioned into four segments, each showing the presence or parameter. the absence of an individual type of irregularity, and into the In the next step, a threshold function is used to eliminate rest of the image that can be disregarded since it contains no pixels that are not relevant for detecting the irregularities. information about the irregularities. Currently, these irregu- Each ROI is processed with a specific value of the binary larities are detected through manual inspection of the commu- threshold. This step results in a black (background) and white tators. This approach is time-consuming and its results may (relevant regions) image. be subjective. The goal of this research is to design and im- Connected pixels are then grouped together with the plement an automated visual inspection of commutators that connected-component labeling algorithm [5] in order to de- would overcome the weaknesses of the manual inspection. tect the connected regions. This enables to process relevant regions, i.e., particles, rather than single pixels. 3 AUTOMATED VISUAL INSPECTION OF In the last image processing step, a particle filter is used to GRAPHITE-COPPER COMMUTATORS remove small particles that can be present in the image due to noise. The size of the particles to be filtered is an additional The idea for the automated visual inspection of graphite- input parameter. copper commutators is to consist of three phases. Firstly, After the image is processed with the computer vision al- a digital image of the commutator is obtained. Secondly, gorithms, the following six attributes are calculated for each this image is processed using computer vision algorithms that ROI, i.e., for each type of irregularity: extract informative attributes. Finally, these attributes are used by classifiers to determine whether the irregularities are • the number of particles, present on the commutator and identify their type in the case of their presence. Before applying this inspection procedure • the cumulative size of particles in pixels, on the production line, the classifiers need to be built with machine learning algorithms. 
• the maximal size of particles in pixels, 3.1 Processing commutator images with computer • the minimal size of particles in pixels, vision algorithms • the gross/net ratio of the largest particle, and Commutator images are processed in several steps. Since the commutators are not properly aligned, their rotation angle and • the gross/net ratio of all particles. position in the image have to be determined first. The center of the commutator is detected by matching the image with These attributes are then used to build the classifiers and clas- the template image of the center. Next, the position of the sify the commutator images. commutator’s pin is found. The line between between the center of the commutator and the pin is used to determine the 3.2 Learning classifiers with machine learning al- rotation angle. gorithms The next step of image processing consists of determining The goal of the classifiers is to determine whether a commuta- four regions of interest (ROIs), one for each type of irregu- tor contains any irregularities. Two approaches were applied larities. Each ROI is obtained by applying the corresponding to solve this classification problem: binary mask to the image. Before applying the binary mask, the mask has to be properly positioned and rotated. To that 1. all the attributes were included in a single set of at- end, the information about the center of the commutator and tributes and a single classifier was built to classify the its rotation angle (obtained in the previous step) is used. As commutators into one out of five possible classes (either a result, four ROIs are obtained. They are further processed one of the four types of irregularities or no irregularity), with the same sequence of computer vision algorithms, where 2. each type of irregularity was detected with a binary clas- only the input parameter values of these algorithms are spe- sifier, where the binary classifiers were prioritized to cific for each ROI. determine the irregularity when irregularities of several At this stage, ROIs are in RGB format. However, prelim- types were detected. inary tests showed that in order to reliably detect the irregu- larities, only one color plane should be used. Moreover, these The classification approach using four binary classifiers was tests also showed, that the most appropriate color plane is the further structured based on the attributes and learning in- red one, with the exception of the excess of solder irregularity stances used when building the binary classifiers. Specifi- for which the best color plane is the blue one. Consequently, cally, when building a binary classifier for detecting irregu- the most appropriate color plane is extracted from each ROI larities of a particular type, four learning modes were tested: with respect to the observed irregularity. This extraction re- sults in gray-scale ROIs. 1. only attributes of the corresponding ROI and only com- Gray-scale ROIs are then filtered with the median filter to mutators that are either without irregularities or contain reduce noise from the images. For this purpose a 2D median irregularities of this particular type are used, 23 Class Number of images Learning mode Best priority Max. accuracy [%] Without irregularities 212 1 C1, C3, C2, C4 81.8 Metalization defect 35 2 C3, C2, C1, C4 77.1 Excess of solder 35 3 C2, C3, C1, C4 81.5 Deficit of solder 49 4 C1, C3, C4, C2 83.5 Disoriented 32 Table 3: The best binary classifier priorities and classifica- Table 1: Distribution of test images. 
tion accuracies of learning modes. Median Threshold Particle Highest Best learning Max. Class window size value size priority mode accuracy [%] Metalization defect 3 54 13 C1 4 83.5 Excess of solder 3 5 2 C2 4 83.2 Deficit of solder 5 78 760 C3 4 83.5 Disoriented 1 81 184 C4 4 83.5 Table 2: Input parameter values for the computer vision al- Table 4: The best learning modes and classification accura- gorithms. cies of binary classifier priorities. 2. all attributes, but only commutators that are either with- to, for example, correctly classify a commutator with irregu- out irregularities or contain irregularities of this particu- larity x1 when the binary classifier for irregularity x2 is used. lar type are used, Such performance is not guaranteed when building the binary classifiers for irregularity xi without taking into account the 3. only attributes of the corresponding ROI, but all commu- images of irregularities xj, i = j (learning modes 1 and 2). tators including irregularities of all types are used, and Consequently, when classifying the commutators with pre- viously unseen irregularities (learning modes 1 and 2), the 4. all attributes and all commutators including irregularities classification accuracy varies significantly with respect to the of all types are used. priority of classifiers as shown in Figure 1. These results also confirm that partitioning the classification problem into four 4 EXPERIMENTS AND RESULTS subproblems, one for each irregularity type, results in higher The proposed method for detecting irregularities was tested classification accuracy, but only if all attributes and commu- on a set of images of commutators without irregularities and tators with all irregularities are used when building the binary the ones containing irregularities. The distribution of the test classifiers (see the classification accuracy of learning mode 4 images among the irregularity classes is shown in Table 1. in comparison to classification accuracy of the single classi- The applied computer vision algorithms were imple- fier in Figure 1). On the other hand, when building the binary mented in Open Computing Language (OpenCL) [3] that is classifiers from the reduced set of attributes or the reduced set suitable for deploying on embedded many-core platforms and of irregularities, the obtained classification accuracy is lower installing in the production environments. More precisely, we than the classification accuracy of the single classifier. Fi- used the OCL programming package [2], which is an imple- nally, these results show that the priority of classifiers influ- mentation of OpenCL functions in the Open Computer Vision ences the classification accuracy. The priority is especially (OpenCV) library [4]. The connected-component labeling al- important when using learning modes 1 and 2. gorithm was implemented based on description from [5]. The The results were further analyzed with respect to various input parameter values of computer vision algorithms were priorities of binary classifiers and learning modes (see Tables determined using a tuning procedure described in [1] and are 3 and 4). For this purpose, the binary classifiers were abbre- shown in Table 2. The classifiers were built using the Weka viated as follows: machine learning environment [7]. In particular, the J48 algo- • C rithm, the Weka’s implementation of the C4.5 algorithm for 1 – the binary classifier for detecting metalization de- fects, building decision trees [6], was used for this purpose. 
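A sketch of how the prioritised binary classifiers could be combined is given below: one binary classifier per irregularity type is queried in a fixed priority order, the first positive answer determines the class, and "no irregularity" is the fallback. A scikit-learn decision tree stands in for Weka's J48, and the attribute matrix, labels and priority order are illustrative stand-ins rather than the paper's data.

```python
# Prioritised one-vs-rest classifiers for the four irregularity types.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

TYPES = ["metalization", "excess_solder", "deficit_solder", "disoriented"]

def train_binary_classifiers(X, y):
    """Learning mode 4: each classifier sees all attributes and all commutators,
    with a one-vs-rest target for its own irregularity type."""
    return {t: DecisionTreeClassifier(random_state=0).fit(X, (y == t).astype(int))
            for t in TYPES}

def classify(classifiers, x, priority=("metalization", "deficit_solder",
                                        "excess_solder", "disoriented")):
    for t in priority:                                  # first positive classifier wins
        if classifiers[t].predict(x.reshape(1, -1))[0] == 1:
            return t
    return "no irregularity"

# Stand-in data: 24 attributes (6 per ROI) and labels over the five classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 24))
y = rng.choice(TYPES + ["ok"], size=300)
clfs = train_binary_classifiers(X, y)
print(classify(clfs, X[0]))
```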
Figure 1 shows the classification accuracies obtained with • C2 – the binary classifier for detecting the excess of sol- the tested classifiers and learning modes. When binary classi- der, fiers are applied, all the permutations of priorities are tested, therefore a distribution of classification accuracy is shown. • C3 – the binary classifier for detecting the deficit of sol- The results indicate that the highest classification accuracy is der, and obtained using learning mode 4, i.e., when the attributes de- scribing all types of irregularities and the images of all com- • C4 – the binary classifier for detecting disoriented com- mutators are used to build the binary classifiers. This enables mutators. 24 Binary classifiers, learning mode 1 Binary classifiers, learning mode 2 Binary classifiers, learning mode 3 Binary classifiers, learning mode 4 Single classifier 0.85 0.8 0.75 Classification accuracy [%] 0.7 0.65 Classifier Figure 1: Classification accuracies of the tested classifiers and learning modes. Table 3 shows the best priorities of binary classifiers and the accuracy. Additional attributes could be extracted from the corresponding classification accuracy for each learning mode. images with machine vision algorithms. It would be also in- This table shows that the most important binary classifier is teresting to compare our results with the results produced by C1 since it has the highest priority in two cases. In addi- the existing methods for detecting irregularities on semiprod- tion, the highest classification accuracy is obtained when this ucts. The ultimate goal of this work is to put the automated classifier has the highest priority. The second most important inspection procedure into regular use on the production line. classifier is C3 since it has the highest priority once and the second-highest priority three times. ACKNOWLEDGEMENT Table 4 shows the best learning mode and the correspond- This work has been partially funded by the ARTEMIS Joint ing classification accuracy when the binary classifiers have Undertaking and the Slovenian Ministry of Economic Devel- the highest priority. These results show that the learning opment and Technology as part of the COPCAMS project mode 4 is the best one irrespectively of the binary classifier (http://copcams.eu) under Grant Agreement number that has the highest priority. Nevertheless, when classifier 332913, and by the Slovenian Research Agency under re- C2 has the highest priority, a lower classification accuracy is search program P2-0209. achieved than in other cases. References 5 CONCLUSIONS [1] V. Koblar, E. Dovgan, and B. Filipič. Tuning of a machine- This paper presented the development of an automated proce- vision-based quality-control procedure for semiproducts in au- dure for visual detection of irregularities on graphite-copper tomotive industry. 2014. Submitted for publication. commutators after the soldering of graphite and copper in the [2] OpenCL module within OpenCV library. http: production process. Four types of irregularities were detected //docs.opencv.org/modules/ocl/doc/ introduction.html. a) with a single classifier and b) by partitioning the prob- [3] OpenCL: The open standard for parallel programming. http: lem into four subproblems, learning the binary classifiers for //www.khronos.org/opencl/. each irregularity type and assigning priorities to the classi- [4] OpenCV: Open source computer vision. http://opencv. fiers. The results show that the highest classification accu- org/. 
racy is achieved when the binary classifiers are used that are [5] D. P. Playne, K. A. Hawick, and A. Leist. Parallel graph com- trained on the data of all types of irregularities. The results ponent labelling with GPUs and CUDA. Parallel Computing, 36(12):655–678, 2010. also indicate that the priority of classifiers significantly in- [6] J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan fluences the classification accuracy and therefore needs to be Kaufmann Publishers, 1993. taken into account. [7] Weka Machine Learning Project. http://www.cs. In the future work we will test additional machine learn- waikato.ac.nz/ml/weka/index.html. ing algorithms for potential improvement of the classification 25 AN ELDERLY-CARE SYSTEM BASED ON SOUND ANALYSIS Martin Frešer1, Igor Košir2, Violeta Mirchevska1, Mitja Luštrek1 Department of Intelligent Systems, Jožef Stefan Institute, Slovenia1 Smart Com d.o.o., Slovenia2 e-mail: martin.freser@gmail.com, igor.kosir@smart-com.si, {violeta.mircevska, mitja.lustrek}@ijs.si ABSTRACT reducing the costs for elderly care and the burden put on the working-age population. This paper proposes an elderly-care system, which uses a This paper presents an elderly-care system, which uses a single sensing device installed in the user's home, single sensing device installed in the user's home, primarily primarily based on a microphone. We present based on a microphone. A microphone may serve both as a preliminary results on human activity recognition from sensor and as a communication device. As a sensor, a sound data. The recognition is based on 19 types of sound microphone may be used for detecting user's activity (e.g. features, such as spectral centroid, zero crossings, Mel- sleep, eating, opening a door) and consequently reasoning frequency cepstrum coefficients (MFCC) and linear about potential problems related to the user (e.g. the user did predictive coding (LPC). We distinguished between 6 not eat whole day, the user is sleeping much more than usual). classes: sleep, exercise, work, eating, home chores and As a communication device, it allows the user to initiate home leisure. We evaluated the recognition accuracy specific services by simply saying a keyword (e.g. call for using 4 supervised learning algorithms. The highest help). It is also needed for remote user-carer communication. accuracy, obtained using support vector machines, was Elderly care based on microphone has not received a lot of 76%. attention, although technology acceptance studies show that most users would accept to have a microphone for home care 1. INTRODUCTION services. Ziefle et al. [3] performed a user acceptance study Predictions made by the Statistical Office of the European comparing three home-integrated sensor types: microphone, Communities state that the over-65 population in EU28 camera and positioning system. According to this study, the expressed as a percentage of the working-age population microphone (plus speaker) is the most accepted technology, (aged between 15 and 64) will rise from 27% in 2014 to 50% followed by the positioning system, while the camera is in 2060 [1]. This demographic trend puts an immense ranked last. pressure to change current health and care practices, which The paper is organized as follows. In Section 2, we already accounts for around 10 % of EU's GDP spending [2]. describe the system architecture. Activity recognition based Innovative remote care systems are emerging, which motivate on sound analysis is presented in Section 3. 
Evaluation of the and assist the elderly to stay independent for longer, thus presented approach on real-world recordings follows in Figure 1: An elderly-care system architecture. 26 Section 4. Section 5 concludes the paper and presents future work. 2. SYSTEM ARCHITECTURE Figure 1 presents the architecture of the elderly-care system. Most of today’s commercial elderly-care systems offer a so-called emergency call functionality. The user is wearing a red button using which he/she may call for help in case of emergency. By pressing the red button, the care-system establishes a phone connection to a carer or a call center through a telephone network. We use a private branch exchange network (PBX) for establishing such calls. We extend this functionality with sound analysis in order Figure 2: The activity recognition process. to provide context to the emergency call (e.g. past user Then we extract sound features. Each feature is extracted activity), as well as to provide higher safety – the system in 20 ms long window. This is because sound signal is establishes an emergency call when certain types of sound, constantly changing and in such short window we assume that such as screaming, are detected. In order to do so, a cloud- is not changing statistically much. We still have to have based system is established consisting of 4 main components: enough samples though, so shorter windows are sensor engine, data unit, notification engine and control unit. inappropriate. We can also use window overlap so we lose The sensor engine analyses the sound in the apartment in less information. We used 20% of window overlap. order to detect user’s activity (e.g. eating, sleep) or critical We extracted 19 types of features and we aggregated them sound patterns (e.g. screaming, fire alarm). The output of this in a window of one-minute length. We put together all engine is kept in the data unit. In case of emergency detected recordings that were recorded in one minute, then we through sound analysis, the control unit notifies a carer or a extracted each feature in 20 ms window and we aggregated it specialized call center through the notification engine about using mean and standard deviation, so we got a feature vector, the user who needs help and why automatic emergency call is which represented one minute. We also tried to aggregate for being established. When the carer responds, the control unit one second, but we got worse results. establishes a telephone connection with the user’s apartment Features were: Spectral centroid, Spectral rolloff point, through the PBX network, enabling the carer to hear what is Spectral flux, Compactness, Spectral variability, Root mean happening in the apartment and act accordingly. square, Fraction of low energy, Zero crossings, Strongest beat, Strength of strongest beat, Strongest frequency via FFT (Fast Fourier transform) maximum, MFCC’s (Mel frequency 3. ACTIVITY RECOGNITION BASED ON SOUND cepstrum coefficients) (13 coefficients), Linear predictive DATA coding (LPC) (10 coefficients), Method of moments (5 People can distinguish quite well between some everyday features), Partial based spectral centroid, Partial based activities just by listening to them. For example, if we hear a spectral flux, Peak based spectral smoothness, Area method spoon hit a plate, we can say that the person is probably of moments (10 features) and Area method of moments of eating; if we hear the sound of pressing keyboard buttons, we MFCCs (10 features). 
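As an illustration of the per-minute feature vectors described above, the sketch below computes a small subset of the 19 feature types (spectral centroid, zero-crossing rate, RMS energy and 13 MFCCs) on 20 ms frames with 20 % overlap and aggregates them with mean and standard deviation. librosa is used in place of the jAudio library from the paper, and the parameter choices are illustrative.

```python
# Per-minute feature vector from 20 ms analysis windows with 20 % overlap.
import numpy as np
import librosa

def minute_features(y, sr):
    frame = int(0.020 * sr)                 # 20 ms analysis window
    hop = int(0.016 * sr)                   # 20 % overlap between windows
    feats = np.vstack([
        librosa.feature.spectral_centroid(y=y, sr=sr, n_fft=frame, hop_length=hop),
        librosa.feature.zero_crossing_rate(y, frame_length=frame, hop_length=hop),
        librosa.feature.rms(y=y, frame_length=frame, hop_length=hop),
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=frame,
                             hop_length=hop, n_mels=40),
    ])                                      # shape: (16 feature rows, n_frames)
    # aggregate the per-window values over the whole one-minute recording
    return np.concatenate([feats.mean(axis=1), feats.std(axis=1)])

# Example: one minute of stand-in noise at 16 kHz gives a 32-dimensional vector.
sr = 16000
audio = np.random.default_rng(0).normal(size=60 * sr).astype(np.float32)
vector = minute_features(audio, sr)
```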
Those features were aggregated using can say that the person is either at work or at home and is mean and standard deviation. We also added 10 Area using a computer. We developed a system that automatically moments of Area method of moments of MFCC’s. This sums detects everyday home activities based on sound. up to 136 features. All features are explained in [4]. Figure 2 presents the process of activity recognition from Since we have a lot of features, we use feature selection sound data. Firstly, we gather data using a recording device, algorithms. such as a microphone. When recording, some privacy Finally, we use supervised machine learning techniques to protection should be taken into account (e.g. we could record build classifiers for our data. short sequences of time so we could not be able to recognize spoken words). We propose recording for 5 minutes in the following way: we record 200ms in every second for 1 minute 4. EVALUATION and we do not record remaining 4 minutes. In this section we present an evaluation of our experiment. We gathered recordings from 3 persons in their everyday living with smart phone's microphone. They labeled data with the following activities: "Sleep", "Exercise", "Work", "Eating", "Home - chores" and "Home - leisure". Data was firstly intended for monitoring chronic patients. 27 We split data into training and test set. We were recording Since we had 3 persons, we trained one best-performing each person for 2 weeks and we used the first week as training classifier (SMO) for each person. We got a confusion matrix set and the second week as test set. for each person and we summed all 3 matrices in one. It can We extracted features using open-source library jAudio be seen in Table 1. We can see that activity "Sleep" is almost [4], [5]. flawless. We can also see that "Home - chores" is usually For feature selection and machine learning we used the misclassified as "Home - leisure", which could be a open-source library Weka [6]. We used the feature selection consequence of similar sounds produced in a person's home algorithm ReliefF implemented in Weka on every person. We during various activities. Due to the high number of instances used 4 machine learning algorithms: SMO, J48, labeled as “Work”, we got very good classification of RandomForest and iBK, all with default parameters. We "Work", but there are also many instances misclassified as measured accuracies of all algorithms and then we used the "Work". We must take into account that recorded persons best-performing algorithm and we measured F-measures for worked in the office, so many sounds are similar as in the all the activities. home environment. We can conclude that for different activities, there can be many similar sounds, e.g. when person reads a book at home ("Home - leisure"), there can be silence 78 as if the person took a nap ("Sleep"), so it is very challenging for classifiers to achieve high accuracies. 76 smo 74 Table 1: Summed confusion matrix of all persons. 72 RandomForest 70 j48 g re rk e – res e - 68 leep ercise o tin m o m classified IBK S x a o W o eisu 66 E E H ch H L as 64 198 0 1 0 0 5 Sleep 62 0 48 40 0 0 0 Exercise Figure 3: Average accuracy of all classifiers. 0 18 129 43 20 95 Work 0 As can be seen in Figure 3, the best performing algorithm 0 2 48 50 7 10 Eating was SMO, which produced the highest accuracies on all the 10 6 92 3 66 44 Home - chores tested persons. The average accuracy of SMO is 76 %. 
In Figure 4 we can see the average F-measure per activity for the best-performing algorithm, SMO. The best-recognized activity is "Sleep", with an average F-measure of 0.96, followed by "Work" with 0.85. SMO detected "Eating" and "Exercise" relatively well, with average F-measures of 0.46 and 0.43, respectively. The remaining average F-measures, for "Home - leisure" and "Home - chores", were 0.38 and 0.26, respectively.

Figure 4: Average F-measure values of SMO.

5. CONCLUSION

This paper presents a system and an approach for human activity recognition based on sound. The approach was tested on real-life recordings of three persons who annotated their activity for 2 weeks.

As outlined in Section 4, activity recognition from sound on 1-minute intervals may be challenging. There may be complete silence during different kinds of activities (e.g. sleep, work), or the recording may be dominated by speech. Therefore, it is difficult to achieve high accuracies in such settings.

Nevertheless, activity recognition from sound may be used for remote elderly care. If we detect that the user was eating at the usual times during the day, even without an exact value for the eating period, we may conclude that the user's state is normal. With reliable sleep recognition, we may detect whether the person is waking up during the night or whether the period of sleep is lengthening, both of which may indicate a health problem. As future work, we need to record everyday living activities of the elderly and test the system's capability to detect events that are critical for determining their health state.

References
[1] Eurostat. Retrieved September 2, 2014, from http://epp.eurostat.ec.europa.eu/tgm/table.do?tab=table&init=1&language=en&pcode=tsdde511&plugin=1
[2] European Commission, Horizon 2020, Health, demographic change and wellbeing. Retrieved September 2, 2014, from http://ec.europa.eu/research/participants/portal/doc/call/h2020/common/1617611-part_8_health_v2.0_en.pdf
[3] M. Ziefle, S. Himmel, W. Wilkowska (2011). When your living space knows what you do: Acceptance of medical home monitoring by different technologies. Lecture Notes in Computer Science, pp. 607-624.
[4] D. McEnnis, I. Fujinaga, C. McKay, P. DePalle (2005). jAudio: A feature extraction library. ISMIR.
[5] D. McEnnis, I. Fujinaga (2006). jAudio: Improvements and additions. ISMIR.
[6] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten (2009). The WEKA data mining software: An update. SIGKDD Explorations, Volume 11, Issue 1.

ARE HUMANS GETTING SMARTER DUE TO AI?
Matjaž Gams
Department of Intelligent Systems, Jožef Stefan Institute
Jamova cesta 39, 1000 Ljubljana, Slovenia
e-mail: matjaz.gams@ijs.si

ABSTRACT
Humans are getting smarter due to the use of tools: in history because of mechanical tools, and in recent decades due to information tools. The hypothesis in this paper goes a step further: that we are getting smarter due to the use of AI.
The thesis is indicated by solutions to three well-known logical paradoxes that have recently been resolved by the author of this paper: the unexpected hanging paradox, the Pinocchio paradox and the blue-eyes paradox. This paper is a slightly shortened version of the Informatica paper on the same topic [19].

1. INTRODUCTION

Systematic measurements show that scores on standard broad-spectrum IQ tests improve each decade; according to the Flynn effect [1], humans are getting smarter and smarter. One theory claims that the increase in human intelligence is related to the use of information tools [2], which often progress exponentially over time [3]. In this paper we go a step further: artificial intelligence (AI) influences human intelligence in a positive way, like the other influencing factors. We illustrate the hypothesis in Figure 1.

Figure 1: Current and predicted growth of computer and human computing capabilities to solve problems.

The y-axis of Figure 1 is on a logarithmic scale; therefore, the linear growth of computer skills on the graph corresponds to the exponential nature of Moore's law [4]. Basic human physical and mental characteristics, such as speed of movement, coordination or speed of human computing, have remained nearly constant in recent decades, as represented by the horizontal line in Figure 1.

Our first thesis is that the ability of humans to solve problems increases due to information tools such as computers, mobile devices with advanced software, and AI in particular (the bold top line in Figure 1). Programs such as the Google browser may provide the greatest knowledge source available to humans, thereby representing an extension of our brains, as calculators do in the field of arithmetic. The stronger and more provocative hypothesis is that humans are getting smarter on their own due to AI comprehensions. After all, AI is about intelligence. In the AI community [5], it is generally accepted that AI progress is increasing and might even enable human civilization to take a quantitative leap [6].

Several opposing theories claim that humans actually perform worse on their own, since machines and tools have replaced humans' need to think for themselves. We argue that while this effect may be valid for human physical properties, related for example to obesity due to lack of physical activity, it is not the case in mental tasks. Another pessimistic viewpoint suggests that intelligent civilizations decline after reaching a certain development level (see Figure 1), possibly because of overpopulation, self-destruction or depletion of natural resources. This would explain why we have not yet detected alien civilizations, though Drake's equation [7] indicates that many such civilizations should exist. This remains an open question.

How can we indicate that AI helps humans think better? If we can show that humans can solve logical puzzles that until recently they were not able to solve without computers, that would be a good indication of humans getting smarter on their own. An objection might be that a single solution of a single puzzle is far too little to show anything. However, an indication it might be - at least enough to start a debate. To demonstrate the idea, we analyze the unexpected hanging paradox [8, 9, 10, 11] and briefly mention a couple of other logical paradoxes.

2. THE LIAR PARADOX

For an introduction to logical paradoxes we quickly examine the liar paradox, first published in [12]. According to [13], it was first formulated by the Greek philosopher Eubulides: "A man says that he is lying. Is what he says true or false?" This sentence is false when it is true. These days, the paradox is usually presented in the form "This sentence is false." Today, however, it is generally accepted that there is no true paradox, since the statement is simply false [14].
The contradiction is of the form "A and not A", or "it is true and false". In other words, if a person always lies by definition, then that person is allowed to say only lies. Therefore, such statements are simply not allowed, which means they are false.

We presented the liar paradox to analyze why humans had trouble with it before and why it is now seen as a trivial case. When faced with the liar paradox for the first time, humans fall into a loop of true/untrue derivations without observing that their thinking was already falsified by the declaration of the problem. It seems a valid logical problem, so humans apply logical reasoning. However, the declaration of the logical paradox was illogical from the start, rendering logical reasoning meaningless.

By analogy, 1 + 1 = 2, and we all accept this as a true sentence without any hesitation. Yet one liter of water and one liter of sugar do not combine to form two liters of sugar water. Therefore, using common logic/arithmetic in such a task is inappropriate from the start.

Which AI methods help us better understand such paradoxes? The principle and paradox of multiple knowledge [15] tentatively explain why humans easily resolve problems such as the liar paradox. We use multiple ways of thinking not only in parallel, but also with several mental processes interacting during problem-solving. Different processes propose different solutions, and the best one is selected. The basic difference of the multiple-knowledge viewpoint compared to the classical ones occurs already at the level of neurons. The classical analogy of a neuron is a simple computing mechanism that produces 0/1 as output. In the multiple-knowledge viewpoint, each neuron outputs 2^N possible outcomes, which can be demonstrated if N outputs from a single neuron are all connected to N inputs of another neuron. In summary, the multiple-knowledge principle claims that the human computing mechanism is already much more complex at the level of a neuron than commonly described, and even more so at the level of higher mental processes. Therefore, humans have no problem computing that one apple and one apple are two apples, while one liter of water and one liter of sugar make 1.6 liters of liquid with a mass of 2.25 kilograms, since they use multiple thinking. It is only that a person who logically encounters the sugar-water merge for the first time may claim that it will result in 2 liters of sugar water. However, after an explanation or an experiment, humans comprehend the problem and have no future problems of this kind.

Another AI solution at hand uses contexts. In arithmetic, 1 + 1 = 2. In merging liquids and solid materials, 1 + 1 ≠ 2. In the first case the context was arithmetic, and in the second case merging liquids and solid materials. Contexts enable an important insight into paradoxes such as the unexpected hanging paradox.

3. THE UNEXPECTED HANGING PARADOX

The unexpected hanging paradox, also known as the hangman paradox, the unexpected exam paradox, the surprise test paradox, or the prediction paradox, yields no consensus on its precise nature, so a final correct solution has not yet been established [9]. It is a paradox about a person's expectations about the timing of a future event that they are told will occur at some unexpected time [16]. The paradox has been described as follows [9]:

A judge tells a condemned prisoner that he will be hanged at noon on one weekday in the following week, but that the execution will be a surprise to the prisoner: he will not know the day of the hanging until the executioner knocks on his cell door at noon that day. Having reflected on his sentence, the prisoner draws the conclusion that he will escape the hanging. His reasoning is in several parts. He begins by concluding that the "surprise hanging" cannot be on Friday, as, if he has not been hanged by Thursday, there is only one day left - and so it will not be a surprise if he is hanged on Friday. Since the judge's sentence stipulated that the hanging would be a surprise to him, he concludes it cannot occur on Friday. He then reasons that the surprise hanging cannot be on Thursday either, because Friday has already been eliminated and, if he has not been hanged by Wednesday night, the hanging must occur on Thursday, making a Thursday hanging not a surprise either. By similar reasoning he concludes that the hanging can also not occur on Wednesday, Tuesday or Monday. Joyfully he retires to his cell, confident that the hanging will not occur at all. The next week, the executioner knocks on the prisoner's door at noon on Wednesday - which, despite all of the above, is an utter surprise to him. Everything the judge said came true.

Evidently, the prisoner miscalculated, but how? Logically, the reasoning seems correct. While there have been many analyses and interpretations of the unexpected hanging paradox, there is no generally accepted solution.
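The prisoner's elimination argument can be written out mechanically. The snippet below is an illustration of that backward induction only - not of the AI-based analysis developed in this paper - and the names in it are ours.

```python
# Illustration of the prisoner's (flawed) backward induction,
# not of the AI-based resolution argued for in this paper.
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
possible = list(days)
while possible:
    # If he has not been hanged before the last remaining day, that day would be
    # certain and hence no surprise -- so the prisoner strikes it out.
    possible.pop()
print(possible)   # []: the prisoner concludes that a surprise hanging is impossible
```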
The paradox is interesting to study because it arouses interest in both laymen and scientists. Here, we provide a different analysis based on the viewpoint of cooperating AI agents [5, 16], contexts and multiple knowledge [15]. It might be the case that similar solutions have been presented before, but it seems that the AI viewpoint disregards potential complications and provides a simple solution.

First we examine which events are repeatable and which are irreversible. The prediction of a hanging on one out of five possible days is well defined through the real-life empirical fact of a human life being irreversibly terminated. However, the surprise is less clearly defined. If it denotes cognitive surprise, then the prisoner can simply be sure, each new day, that the hanging will take place on the current day; no surprise is then assured on any day, even the first, so a hanging under the given conditions is not possible. Such an interpretation makes no sense. To avoid the prisoner being cognitively certain, the following modifications are often proposed [9]:

The prisoner will be hanged next week, and the date (of the hanging) will not be deducible in advance from the assumption that the hanging will occur during the week (A).

The prisoner will be hanged next week and its date will not be deducible in advance using this statement as an axiom (B).

Logicians are able to show that statement (B) is self-contradictory, indicating that in this interpretation the judge uttered a self-contradicting statement leading to a paradox. Chow [10] presents a potential explanation through epistemological formulations, suggesting that the unexpected hanging paradox is a more intricate version of Moore's paradox [9]: a suitable analogy can be reached by reducing the length of the week to just one day. Then the judge's sentence
becomes: "You will be hanged tomorrow, but you do not know that."

Now we can apply AI methods to analyze the paradox. First, the judge's statement is a one-sided contract from an AI agent viewpoint, defining a way of interacting and cooperating. As with any agreement or contract, it also has some mechanisms defining the consequences if one side violates the agreement. Since the judge unilaterally proclaimed the agreement, he can even violate it without any harm to himself, whereas the prisoner's violations are punished according to the judge's will and the corresponding regulations. For example, if the prisoner harms a warden, the deal is probably off, and the hanging can occur at the first opportunity, regardless of whether it is a surprise. This is an introductory indication that the hanging paradox is from the real world and that it matters - it is not just logical thinking. Even more importantly, it enables a valid conclusion that any error in the prisoner's actions releases the judge from his promise.

Since the judge is the interpreter of the agreement, he could accept the weird viewpoint that it suffices for the prisoner to claim a surprise in order to be released. However, the judge is supposed to be a smart person, and there is no sense in such a viewpoint. The judge is also supposed to be an honest person, and as long as the prisoner abides by the appropriate behavior, the judge will keep his word and presumably postpone the execution if the prisoner predicts the exact day of the hanging.

Now we come to the crucial, reasonable definition of the ambiguity, defined by the smart and honest judge. The term deducible now means that the prediction will be 100 percent guaranteed accurate about a one-time event (that is, the hanging), so such a prediction can be uttered only once a week, not each day anew. Therefore, the prisoner has exactly one
chance of not only predicting, but also explaining with certainty to the judge, why the hanging will occur on that particular day. The judge will have to be persuaded; that is, he will have to understand and accept the prisoner's line of reasoning. If not, the deal is off and the judge can choose any day while still keeping his word.

For easier understanding, consider that the prisoner is given a life-saving coupon on which he writes the predicted day and stores it in the judge's safe on Monday morning with the explanation attached. Obviously, the prisoner stands no chance if the judge orders the hanging on Monday. Namely, if the prisoner proposes Monday, he cannot provide a deducible explanation why the hanging will happen on Monday. Yes, he will not be surprised in cognitive terms, but both a correct prediction and a deducible explanation are required in order to avoid the hanging. The only chance to avoid the hanging is to predict Friday and hope that he will not be hanged before Friday. (In this case, the judge could still object that, on Monday for example, the prisoner could not provide a plausible explanation for Friday. Yet that would not be fair since, on Friday, the prisoner would indeed be sure of the judge coming into contradiction.) Even if the prisoner is allowed to deposit the one and only coupon on any day of the week, there is no major difference in terms of the explanation in this paper. Again, if the prisoner is allowed to deposit the coupon each day anew, this formulation makes no sense.

We can further explain the error in the prisoner's line of reasoning by assuming that, instead of giving his ruling five days in advance, the judge gave it on Thursday morning, leaving a two-day opportunity. Since the prisoner could use the single pardon (remember: deducible for a one-time event means one prediction, once) and save himself on Friday, he concludes that Thursday is the only day left and cashes in his only coupon with a 100 percent certain logical explanation on Thursday. However, in this case the judge could carry out the hanging on Friday. Why? Because the prisoner provided the only 100 percent certain prediction in the form of the single life-saving coupon on Thursday, which means that on Friday he could not deliver the coupon. In other words, the prisoner wrongly predicted the hanging day and therefore violated the agreement.

The situation on Thursday is similar to the situation on Monday. Even if the judge knocks on the door on Thursday, and the prisoner correctly predicted Thursday, he still could not provide a 100 percent certain explanation why the hanging would occur on Thursday, since the judge could come back on Friday as described above; therefore, the judge can proceed on Thursday or Friday without violating his proclamation.

4. DISCUSSION

Wikipedia offers the following statement regarding the unexpected hanging paradox [9]:

There has been considerable debate between the logical school, which uses mathematical language, and the epistemological school, which employs concepts such as knowledge, belief and memory, over which formulation is correct.

According to other publications [8], this statement correctly describes the current state of the scientific literature and the human mind.

To some degree, solutions similar to the one presented in this paper have already been published [8, 9]. However, they have not been generally accepted and, in particular, have not been presented through AI means. Namely, AI enables a clearer explanation, such as: the error in the prisoner's line of reasoning occurs when extending his induction from Friday to Thursday, as noted earlier, but the explanation in this paper differs. The correct conclusion about Friday is not:

"Hanging on Friday is not possible" (C), but:

"If not hanged till Friday and the single prediction with explanation was not applied for any other day before, then hanging on Friday is not possible." (D)

The first condition in (D) is part of common knowledge. The second condition in (D) comes from common sense about one-sided agreements: every breach of the agreement can cause its termination. The two conditions reveal why humans have a much harder time understanding the hanging paradox than the liar paradox. The conditions are related to the concepts and interpretation of time and deducibility and should be applied simultaneously, whereas only one insight is needed in the liar paradox. In AI, this phenomenon is well known as context-sensitive reasoning in agents, which was first presented in [18] and has been used extensively in recent years. Here, as in real life, under one context the same line of reasoning can lead to a different conclusion than under another context (remember the sugar water). But one can also treat the conditions in statement (D) as logical conditions, in which case the context can serve for easier understanding.
The same applies to the author of this paper: although he has been familiar with the hanging paradox for decades, the solution at hand emerged only when the insight related to contexts appeared.

Returning to the motivation for the analysis of the unexpected hanging paradox, the example was intended to show that humans have mentally progressed enough to see the trick in the hanging paradox, similar to how people became too smart to be deceived by the liar paradox. This new approach has also been used to solve several other paradoxes, such as the blue-eyes paradox and the Pinocchio paradox. Analyses of these paradoxes are being submitted to other journals.

In summary, the explanation of the hanging paradox and the difficulty it poses for human solvers resemble those of the liar paradox before it was solved beyond doubt. It turns out that both paradoxes are not truly paradoxical; instead, they describe a logical problem in a way that a human using logical methods cannot resolve. Similar to the untrue assumption that a liar can utter a true statement, the unexpected hanging paradox exploits two misconceptions in the prisoner's line of reasoning. The first is that a 100 percent accurate prediction for a single event can be uttered more than once (through a vague definition of "surprise"), and the second is that a conclusion that is valid at one time is also valid during another time span. Due to the simplicity of the AI-based explanation in this paper, there is no need to provide additional logical, epistemological, or philosophical mechanisms to explain the failure of the prisoner's line of reasoning.

This paper provides an AI-based explanation of the hanging paradox for humans in natural language, while formal explanations remain a research challenge. A formal analysis has already been designed for the Pinocchio paradox, whereas the blue-eyes paradox has not yet been formally explained, only in a way similar to this paper.

References
[1] T. W. Teasdale, D. R. Owen (2005). A long-term rise and recent decline in intelligence test performance: The Flynn Effect in reverse. Personality and Individual Differences, Volume 39, Issue 4, September 2005, pp. 837-843, Elsevier.
[2] J. R. Flynn (2009). What Is Intelligence: Beyond the Flynn Effect. Cambridge, UK: Cambridge University Press.
[3] Computing laws revisited (2013). Computer 46/12.
[4] G. E. Moore (1965). Cramming more components onto integrated circuits. Electronics Magazine, 4.
[5] Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI'13) (2013). Beijing, China.
[6] R. Kurzweil (2005). The Singularity is Near. New York: Viking Books.
[7] T. Dean (2009). A review of the Drake equation. Cosmos Magazine.
[8] Wolfram, A. (2014). http://mathworld.wolfram.com/UnexpectedHangingParadox.html
[9] Unexpected hanging paradox, Wikipedia (2014). https://en.wikipedia.org/w/index.php?title=Unexpected_hanging_paradox&oldid=611543144, June 2014.
[10] T. Y. Chow (1998). The surprise examination or unexpected hanging paradox. American Mathematical Monthly 105: 41-51.
[11] E. Sober (1998). To give a surprise exam, use game theory. Synthese 115: 355-73.
[12] D. J. O'Connor (1948). Pragmatic paradoxes. Mind 57: 358-9.
[13] J. C. Beall, M. Glanzberg (2013). In E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
[14] A. N. Prior (1976). Papers in Logic and Ethics. Duckworth.
[15] M. Gams (2001). Weak Intelligence: Through the Principle and Paradox of Multiple Knowledge. New York: Nova Science Publishers, Inc.
[16] R. A. Sorensen (1988). Blindspots. Oxford, UK: Clarendon Press.
[17] H. P. Young (2007). The possible and the impossible in multi-agent learning. Artificial Intelligence 171/7.
[18] R. M. Turner (1993). Context-sensitive reasoning for autonomous agents and cooperative distributed problem solving. In Proceedings of the IJCAI Workshop on Using Knowledge in its Context, Chambéry, France.
[19] M. Gams (2014). The unexpected hanging paradox from an AI viewpoint. Informatica 38, 181-185.
Acknowledgements

The author wishes to thank several members of the Department of Intelligent Systems, particularly Boštjan Kaluža, Mitja Luštrek, and Tone Gradišek, for their valuable remarks. Special thanks are also due to Angelo Montanari, Stephen Muggleton, and Eva Černčic for contributions on this and other logic problems.

DEVELOPING A SENSOR FIRMWARE APPLICATION FOR REAL-LIFE USAGE
Hristijan Gjoreski, Mitja Luštrek, Matjaž Gams
Department of Intelligent Systems, Jožef Stefan Institute, Jožef Stefan International Postgraduate School
e-mail: {hristijan.gjoreski, mitja.lustrek, matjaz.gams}@ijs.si

ABSTRACT
In recent years the demand for intelligent systems that support the life of the elderly is increasing. In order to provide appropriate support, these systems should constantly monitor the user with sensors. However, using sensors in real-life situations is a challenging task, mainly because of the constraints on sensor energy consumption (battery life) and on the memory capacity for storing the sensor data. In this paper we present an example of a sensor communication protocol developed for the Shimmer accelerometers, so that they can be used in real-life situations, i.e., constantly monitoring the user during a normal day. A custom firmware application is developed, which has several functionalities: real-time data streaming through Bluetooth, data logging to the internal microSD card, sending the stored data to a Bluetooth-enabled device, and detecting when the sensor is put on and off a charging dock.

1 INTRODUCTION

The world's population is aging rapidly, threatening to overwhelm society's capacity to take care of its elderly members. The percentage of persons aged 65 or over in developed countries is projected to rise from 7.5% in 2009 to 16% in 2050 [1]. This is driving the development of innovative ambient assisted living (AAL) technologies to help the elderly live independently for longer and with minimal support from the working-age population [2][3]. To provide timely and appropriate assistance, AAL systems must monitor the user by using ambient (environmental) and/or wearable sensors. With the recent development of sensor technology, wearable sensors are gaining attraction and can measure many different user-related parameters: location, activity, physiological parameters, etc. Examples include GPS, accelerometers, gyroscopes, heart-rate sensors, breath-rate sensors, etc.

In order to be used in everyday life situations, these sensors have to be able to constantly monitor the user during the day. However, this is often a challenging task, mostly due to battery-consumption constraints and memory-storage capacity. In this study we present a sensing protocol for the Shimmer accelerometer sensors, so they can be used to constantly monitor the user during the day. A custom TinyOS firmware application was developed in order to satisfy the user's requirements and the sensors' limitations. It implements two modes of operation: real-time data sending, and logging the data in the internal memory and sending it offline in batches.

The motivation and context of the study is the CHIRON project (Cyclic and person-centric health management: integrated approach for home, mobile and clinical environments) [4]. It is a European research project of the ARTEMIS JU Programme with 27 project partners. It includes industry partners (large companies and SMEs), research and academic institutions, and also medical institutions. The project addresses one of today's societal challenges, i.e., "effective and affordable healthcare and wellbeing". CHIRON combines state-of-the-art technologies and innovative solutions into an integrated framework of embedded systems for effective and person-centered health management throughout the complete healthcare cycle, from primary prevention (person still healthy) to secondary prevention (initial symptoms or discomfort) and tertiary prevention (disease diagnosis, treatment and rehabilitation) in various domains: at home, in nomadic environments and in clinical settings.

2 SENSORS

In the CHIRON project, two Shimmer accelerometers are used to monitor the user's activities. The sensor platform is based on the Shimmer Wireless Sensor Network (WSN) module. It is based on a T.I.
MSP430F1611 microcontroller, which operates at a maximum frequency of 8 MHz and is equipped with 10 KB of RAM and 48 KB of flash. Wireless communication is achieved either with Bluetooth v2 (BT - RN-42 module) or through IEEE 802.15.4 (T.I. CC2420 module). In our study we used the standard BT v2 in order to easily connect the sensor with a smartphone. For storage purposes, the Shimmer platform is equipped with an integrated 2 GB microSD card, which is used in the normal operation mode to store sensor readings [5]. The power supply comprises a 450 mAh rechargeable Li-ion battery.

The firmware of the Shimmer platform is based on the open-source TinyOS operating system [6]. It uses the NesC programming language, which is a lightweight version of C. TinyOS/NesC is dedicated to low-power wireless sensors and allows many sensor platforms with a heterogeneous set of hardware devices to be programmed and controlled (microcontroller, sensors, SD cards, etc.).

TinyOS in the Shimmer sensors follows a three-layer abstraction architecture. At the bottom is the Hardware Presentation Layer (HPL), which allows access to input/output pins or registers of the hardware devices. Next, the Hardware Abstraction Layer (HAL) allows configuring more complex functionality in order to communicate with external sensors or resources implemented in the platform. The top layer is the Hardware Independent Layer (HIL), which permits reading the sensor data independently of the digital communication bus. Each layer communicates with the adjacent ones through interfaces, either generic or hardware-specific. As TinyOS is an event-driven operating system, the interfaces call commands that are addressed to the lower layer. These
commands are answered by the lower layers by signaling events. In our case - the Shimmer sensor platform - the HPL and HAL layers are already available, and for its internal resources, such as the accelerometer, the SD card or the Bluetooth radio, the HIL layer is also implemented.

Once the layers are implemented, a firmware application is developed. In our case, the application is based on the specifications (sensing protocol) provided by the doctors in the CHIRON project. The firmware application and the sensing protocol are discussed in the next section.

3 SENSING PROTOCOL AND FIRMWARE APPLICATION

In order to explain the sensing protocol (shown in Figure 1), let us consider the following scenario. The user wakes up and takes the two accelerometers from the charging dock. Once they are taken off the dock, the sensors have to start sensing. The user attaches the sensors to the wearable garment (e.g., chest and thigh elastic straps) and performs an initialization activity sequence. This sequence is performed in order to ensure that the sensors have the right orientation (important for the post-processing of the data). The orientation checking lasts a few minutes, during which the sensor data is streamed in real time to a smartphone application. Once the smartphone confirms that the setup is all right, the user continues with his everyday activities. During this period the sensors log the data locally to a microSD card. At the end of the day, the user takes off the sensors, puts them on the charging docks, and goes to sleep. During the sleep, the sensors are charged and all the data is transferred to the processing unit.

The scenario shows that the battery life of the sensors should last at least 16 hours (the active period of a normal day) and that the sensors should be able to receive commands from a smartphone through Bluetooth. Our tests showed that if the standard firmware application is used (real-time data sending over BT), the battery lasts around 6 hours, which is not sufficient for the whole day. Furthermore, with this approach there would have to be a constant BT connection between the smartphone and the sensors, which is highly unlikely in a real-life scenario. In order to achieve these functionalities we created a custom firmware application which has two modes of operation: real-time data sending, and logging the data on the microSD card and sending it for offline analysis. The application is based on the two standard Shimmer firmware applications, which are publicly available: real-time data acquisition ("BoilerPlate.ihex") and data logging ("JustFATLogging.ihex"). The original logging application (JustFATLogging.ihex) has one main function: to log the acceleration data on the microSD card. The start of logging is triggered when the sensor is removed from the docking station, and the end of logging is triggered when the sensor is put back on the docking station. The data can be accessed only through the USB port of the docking station. We used this firmware application as a base for further development.

First, we added the Bluetooth functionality in order to allow wireless communication between the sensor and the smartphone. However, the activation of the Bluetooth significantly decreased the sensor's battery life. Therefore, we modified the application so that the Bluetooth is activated only when the sensor is put back on the charger. During the charging time, the smartphone sends a command and collects the logged data. When the user decides to mount the sensors, he/she gives a command to start logging and to turn off the Bluetooth. Thus, during the logging process the Bluetooth is off and there is no communication between the smartphone and the sensor.

For the acceleration data it is really important how the sensor is mounted, i.e., the sensor orientation must be the same for every recording. In order to check the orientation of the sensors, an algorithm analyzes the data during some predefined activities, e.g., standing and lying, and accordingly gives a notification to the user if the sensors are not mounted in the correct way. To allow this analysis, the data has to be processed in real time; therefore we added this functionality, i.e., real-time transmission. The real-time data acquisition is performed before the start logging command is sent.

In addition, two more functionalities were implemented: deleting a log file (delete log) and checking the availability of a log file (is log available).

The final modification is related to the timestamps of the data samples and data synchronization between different sensors. In order to synchronize the data between the sensors, one must know the absolute timestamp of the data samples. In our case, we used the timestamp of the start of the logging and the time difference between consecutive data samples.
The sensor's internal crystal clock is used for estimating the time difference between consecutive data samples. Thus, each data sample is labeled with a timestamp provided by the clock. The starting timestamp is sent by the smartphone with each start logging command. Using the starting timestamp and the internal counter's timestamps, the smartphone is able to reconstruct an absolute timestamp for each data sample.

Figure 1. Sensor Firmware Flowchart.

In theory, the tolerance of the internal Shimmer crystal clock (Epson FC-135 32.7680KA-A3) is ±20 ppm, which results in a maximal drift of 1.8 seconds in 24 hours. In the worst-case scenario, when two sensors drift in different directions (+ or -), the time difference is 3.6 seconds, which is acceptable for the project's requirements. Several practical tests were performed and confirmed the theoretical analysis, i.e., the measured drift was in the range of 1 second for a whole-day recording.
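The drift figures above follow directly from the clock specification, and the timestamp reconstruction is simple arithmetic. The sketch below is an illustrative check in plain Python, not the NesC firmware code; the assumption that the counter ticks at the nominal 32768 Hz rate of the 32.768 kHz crystal named above is ours.

```python
# Illustrative check of the drift figures and the timestamp reconstruction
# (plain Python, not the actual NesC firmware).
TOLERANCE_PPM = 20                    # Epson FC-135 crystal tolerance quoted above
SECONDS_PER_DAY = 24 * 3600

max_drift_one_sensor = TOLERANCE_PPM * 1e-6 * SECONDS_PER_DAY  # ~1.7 s, i.e. the ~1.8 s above
max_drift_two_sensors = 2 * max_drift_one_sensor                # ~3.5 s worst case

TICK_HZ = 32768                       # assumed nominal counter rate of the 32.768 kHz crystal

def absolute_timestamps(start_unix_time, tick_counts):
    """Absolute sample times = start timestamp sent by the smartphone
    plus the sensor's internal counter converted to seconds."""
    return [start_unix_time + ticks / TICK_HZ for ticks in tick_counts]
```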
4 SENSOR COMMANDS AND EVENTS

Table 1 describes the commands that can be received by the sensor application firmware. These commands can be sent by any BT-equipped device.

Table 1. Sensor commands.
Real-time transmission - the sensor samples accelerometer data at 50 Hz and sends each sample to the smartphone.
Start logging - the sensor stops the real-time sending and waits for the absolute timestamp from the smartphone. This timestamp is written at the beginning of each log file and is used as a reference point for reconstructing the timestamp of each data sample. After that, the sensor starts logging the accelerometer data. Additionally, the Bluetooth is turned off; there is no communication between the sensor and the smartphone during the logging process.
Is log available - the sensor checks whether a log file is available for sending and sends the availability status.
Send log - the sensor sends the logged file. First the absolute timestamp is sent, and then the accelerometer data samples.
Delete log - the sensor deletes the log file.

Table 2 describes the events that are detected by the application firmware.

Table 2. Events that can be detected by the sensor's application firmware.
Docking event (the sensor is put on the charging dock) - the sensor stops the logging and turns on the Bluetooth. The yellow LED is turned on, indicating that the log file is ready to be sent.
Sensor manual restart - the sensor restarts to the initial state; that is, it stops the logging and turns on the Bluetooth. The yellow LED is turned on, indicating that the log file is ready to be sent.

5 THEORETICAL AND EMPIRICAL TESTS

After developing the firmware application we performed several theoretical and empirical tests. First, we analyzed the amount of data expected to be generated on a daily basis. The MSP430 A/D channels perform 12-bit digitization (2 bytes of storage), and a 16-bit (2-byte) timestamp is stored for each sample. Table 3 summarizes our projections based on the sampling frequency of each sensor. Based on these calculations, the total amount of data for 12-16 h of daily use should be 114 Mb - 152.4 Mb. This amount of data does not pose any issue in any operational mode, since in the real-time scenario BT can achieve data rates of up to 300 kbps, which is more than adequate for the amount of data generated per second, and in the logging mode the microSD cards on the modules have more than enough capacity to store the generated data.

Table 3. The amount of data generated by the accelerometer.
Sampling frequency (Hz): 50 | Data per second (KB/s): 0.78 | Data per hour (MB/h): 2.74

The energy-consumption analysis of the Shimmer platform, presented by Burns et al. [5], indicates that the accelerometer draws 1.6 mA when sampled at 50 Hz. When the sensor streams accelerometer data in real time through BT, the consumption increases to 5.2 mA. From this analysis it is safe to assume that, since the same hardware equipped with a 450 mAh battery is used, the clinical requirement of 6-8 h of data logging (in the storing mode) or an adequate amount of time (around 1 h) of live streaming (streaming mode) can easily be met, provided that the module's batteries are fully charged at the beginning of sensing. Table 4 lists the average battery lifetime (full battery-drainage period) obtained for the two modes of operation from a series of experiments.

Table 4. Average working time for the real-time mode (Bluetooth is active) and the logging mode (Bluetooth is not active; the sensor logs to the SD card).
Sampling frequency (Hz): 50 | Real-time mode: 6 h 30 m | SD-logging mode: 14 days

6 CONCLUSION

In this paper we showed how one can overcome the sensor limitations (battery life and memory storage) by creating a custom firmware application and adjusting it to real-life situations. We presented a sensing protocol and a sensor firmware application developed for the Shimmer accelerometers. The protocol was created so that the sensors can be used in real-life situations, i.e., constantly monitoring the user during a normal day. The developed custom firmware application has several functionalities: real-time data streaming through Bluetooth, data logging to the internal microSD card, sending the stored data to a Bluetooth-enabled device, and detecting when the sensor is put on and off a charging dock.

Acknowledgement

This work was partly supported by the Slovene Human Resources Development and Scholarship funds and partly by the CHIRON project - ARTEMIS Joint Undertaking, under grant agreement No. 2009-1-100228.

References
[1] United Nations (2009). World population ageing. Report.
[2] A. Bourouis, M. Feham, A. Bouchachia. "A new architecture of a ubiquitous health monitoring system: a prototype of cloud mobile health monitoring system," The Computing Research Repository, 2012.
[3] M. Luštrek, B. Kaluža, B. Cvetković, E. Dovgan, H. Gjoreski, V. Mirchevska, M. Gams. "Confidence: ubiquitous care system to support independent living," DEMO at European Conference on Artificial Intelligence, pp. 1013-1014, 2012.
[4] The CHIRON project: http://www.chiron-project.eu/
[5] A. Burns, B. R. Greene, M. J. McGrath, T. J. O'Shea, B. Kuris, S. M. Ayer, F. Stroiescu, V. Cionca. "SHIMMER™ - A wireless sensor platform for noninvasive biomedical research," IEEE Sens. J., vol. 10, no. 9, pp. 1527-1534, 2010.
[6] P. Levis, S. Madden, J. Polastre, R. Szewczyk, K. Whitehouse, A. Woo, D. Gay, J. Hill, M. Welsh, E. Brewer, D. Culler. TinyOS: An operating system for sensor networks. In W. Weber, J. M. Rabaey, and E. Aarts, editors, Ambient Intelligence, chapter 7, pp. 115-148, 2005.
AUTOMATIC RECOGNITION OF EMOTIONS FROM SPEECH
Martin Gjoreski 1, Hristijan Gjoreski 2, Andrea Kulakov 1
1 Faculty of Computer Science and Engineering, Rugjer Boshkovikj 16, 1000 Skopje, Macedonia
2 Department of Intelligent Systems, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia
e-mail: martin.gjoreski@gmail.com, hristijan.gjoreski@ijs.si, andrea.kulakov@finki.ukim.mk

ABSTRACT
This paper presents an approach to the recognition of human emotions from speech. Seven emotions are recognized: anger, fear, sadness, happiness, boredom, disgust and neutral. The approach is applied to a speech database consisting of simulated and annotated utterances. First, numerical features are extracted from the sound database using an audio feature extractor. Next, the extracted features are standardized. Then, feature-selection methods are used to select the most relevant features. Finally, a classification model is trained to recognize the emotions. Three classification algorithms are tested, with SVM yielding the highest accuracies of 89% and 82% using the 10-fold cross-validation and Leave-One-Speaker-Out techniques, respectively. "Sadness" is the emotion recognized with the highest accuracy.

1 INTRODUCTION

Human capabilities for perception, adaptation and learning about the surroundings are often the three main components of the definition of intelligent behavior.
In the last few decades, many studies have suggested that one very important component is left out of this definition of intelligent behavior: emotional intelligence. Emotional intelligence is the ability to feel, express and regulate one's own emotional state, and to recognize and handle the emotional state of others. In psychology, the emotional state is defined as a complex state that results in psychological and physiological changes that influence our behavior and thinking [1].

With the recent advancements of technology and the growing research areas such as machine learning, audio processing and speech processing, emotional states will be an inevitable part of human-computer interaction. More and more studies are working on providing computers with abilities such as recognizing, interpreting and simulating emotional states.

In this research we present an approach for automatic recognition of emotions from speech. The goal is to recognize the emotional state that the speaker is experiencing. Furthermore, the focus is on how something is said, and not on what is said. Besides this approach, where only the speaker's voice is analyzed, there are several different approaches for recognizing the emotional state. In some approaches the voice and the spoken words are analyzed [2]. Some are focused only on facial expressions [3]. Some analyze the reactions in the human brain for different emotional states [4]. There are also combined approaches, where a combination of the mentioned approaches is used [5].

In studies where human emotions are analyzed, mainly two methodologies are used. In the first methodology, the emotions are viewed as discrete and completely distinct classes that are universally recognized [6]. In the second methodology, the emotional states are represented in a 2D or 3D space, where parameters such as emotional distance, level of activeness, level of dominance and level of pleasure can be observed [7]. In this research the discrete methodology is used, so the emotional states are represented as 7 classes: anger, fear, sadness, happiness, boredom, disgust and neutral.

The remainder of this paper is organized as follows. The next section is a brief overview of speech emotion analysis. Then, the methodology used for the process of emotion classification is presented. In the next section, the experimental setup and the results are presented. Finally, the conclusion and a brief discussion of the results are given.

2 SPEECH EMOTION ANALYSIS

Speech emotion analysis refers to the use of methods to extract vocal cues from speech as a marker for emotional state, mood or stress. The main assumption is that there are objectively measurable cues that can be used for predicting the emotional state of the speaker. This assumption is quite reasonable, since emotional states arouse physiological reactions that affect the process of speech production. For example, the emotional state of fear usually initiates rapid heartbeat, rapid breathing, sweating and muscle tension. As a result of these physiological activities there are changes in the vibration of the vocal folds and the shape of the vocal tract. All of this affects the vocal characteristics of the speech, which allows the listener to recognize the emotional state that the speaker is experiencing [8]. The basic speech audio features used for speech emotion recognition are: fundamental frequency (the human perception of fundamental frequency is pitch), power, intensity (the human perception of intensity is loudness), duration features (e.g. rate of speaking) and vocal perturbations. The main question is: are there any objective voice feature profiles that can be used for speaker emotion recognition?
The speech audio features for describing these emotional states, utterances that were rated with more than 60% naturalness the feature profiles will be quite similar so distinguishing and from which the expressed emotion was recognized with them is hard. more than 80%, were included in the final database. In the last few years, new method is introduced where static feature vectors are obtained by using so called acoustic 3.2 Feature Preparation Low-Level Descriptors (LLDs) and descriptive statistical The feature extractor tool used in this research is functionals [10]. By using this approach a big number of openSMILE (Open Speech and Music Interpretation by large feature vectors is obtained. The downside is that not Large Space Extraction) [13]. It is a tool for signal all of the feature vectors are of good value, especially not processing and machine learning. We extracted 1582 for emotion recognition. For that reason a feature selection features in total [14]. The LLDs that openSMILE is using method is often used. are computed from basic features (pitch, loudness, voice quality) or representations of the audio signal (cepstrum, 3 THE APPROACH linear predictive coding). Figure 1 shows the whole process of the speech emotion On these LLDs functionals are applied and static feature classification used in this research. An emotional speech vectors are produced, therefore static classifiers can be database is used, which consists of simulated and annotated used. The functionals that are applied are: extremes utterances. Next, feature extraction is performed by using (position of mix/min value), statistical moments (first to open source feature extractor. Then, the extracted features forth), percentiles (ex. the first quartile), duration (ex. are standardized. After standardization, feature selection percentage of time the signal is above threshold) and methods are used for decreasing the number of features and regression (ex. the offset of a linear approximation of the selecting only the most relevant ones. Finally, the emotion contour). recognition is performed by a classification model. After the feature extraction the feature vectors are standardized so the distribution of the values of the feature Feature Preparation vectors is with mean equal to 0 and standard deviation equal Emotional Feature Feature Feature to 1. Next, a method for feature selection is used. Features Database Extraction Standardization Selection are ranked with algorithms for feature ranking and experiments are done with varying number of top ranked Emotion Emotion features. For ranking the features two different algorithms Classification are used, gain ratio [15] and ReliefF [16]. Both algorithms Figure 1: Scheme for speech emotion classification. are used as they are implemented in Orange software packet for machine learning and data mining [17]. 3.1 Emotional Database There are several emotional speech databases that are 3.3 Emotion Classification extensively used in the literature [11]: German, English, Japanese, Spanish, Chinese, Russian, Dutch etc. One of the Once the features are extracted, selected and standardized, main characteristics of an emotional speech database is the they are used to form the feature vector database. That is a type of the speech: whether it is simulated or it is extracted database in which each data sample is an instance, i.e., from real life situations. The advantage of having a feature vector. 
3.2 Feature Preparation

The feature-extractor tool used in this research is openSMILE (Open Speech and Music Interpretation by Large Space Extraction) [13]. It is a tool for signal processing and machine learning. We extracted 1582 features in total [14]. The LLDs that openSMILE uses are computed from basic features (pitch, loudness, voice quality) or from representations of the audio signal (cepstrum, linear predictive coding). Functionals are applied to these LLDs and static feature vectors are produced, so static classifiers can be used. The functionals applied are: extremes (position of the max/min value), statistical moments (first to fourth), percentiles (e.g. the first quartile), duration (e.g. the percentage of time the signal is above a threshold) and regression (e.g. the offset of a linear approximation of the contour).

After the feature extraction, the feature vectors are standardized so that the distribution of the values of each feature has a mean equal to 0 and a standard deviation equal to 1. Next, a method for feature selection is used. Features are ranked with feature-ranking algorithms, and experiments are done with a varying number of top-ranked features. For ranking the features, two different algorithms are used: gain ratio [15] and ReliefF [16]. Both algorithms are used as implemented in the Orange software package for machine learning and data mining [17].

3.3 Emotion Classification

Once the features are extracted, selected and standardized, they are used to form the feature-vector database. That is a database in which each data sample is an instance, i.e., a feature vector. Additionally, each instance is labeled with the emotion. After this, the instances are used to train a classification model in order to recognize emotions from speech data.
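A rough scikit-learn analogue of this pipeline is sketched below for illustration. The authors used the Orange toolkit with gain-ratio and ReliefF rankers; neither ranker is available in scikit-learn, so a univariate ANOVA ranking stands in for the feature-ranking step, and `X` and `y` are assumed to hold the 1582-dimensional openSMILE vectors and their emotion labels.

```python
# Illustrative scikit-learn analogue (the paper uses Orange with gain ratio / ReliefF;
# the ANOVA ranking below is a stand-in, not the authors' ranker).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

def build_model(n_top_features=500):
    return make_pipeline(
        StandardScaler(),                          # zero mean, unit variance per feature
        SelectKBest(f_classif, k=n_top_features),  # keep only the top-ranked features
        SVC(),                                     # SVM classifier
    )

# X: (n_utterances, 1582) openSMILE feature matrix, y: emotion labels
# model = build_model(500).fit(X_train, y_train)
# accuracy = model.score(X_test, y_test)
```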
The varying color represents the accuracy is 87%, which is significantly high performance number of top ranked features (by ReliefF) used. The highest with such a low number of features. average accuracy of 82% is obtained by using top ranked 1000 features. Also we can see that the accuracy depends Accuracy in % mainly from the speaker that is used as test data. 95 87 86 86 89 88 88 For the experiments about the accuracy per class for each of 85 the 7 emotional states, top ranked 1000 features (by ReliefF) 75 76 83 69 are used. The results are shown in Figure 6. The highest 65 50 100 200 300 400 500 750 1000 1582 accuracy per class of 94% was achieved for the class “sadness” and the lowest accuracy per class of 70% was Number of features achieved for the class “fear”. Accuracy in % Number of features: 300 500 750 1000 1582 100 90 80 70 60 S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 AVG Figure 5: SVM classification accuracy for LOSO with varying number of features 40 Acc A uracy y per cl ass v s al v ue in % [4] R. Horlings, D. Datcu, L. J. M. Rothkrantz. Emotion 100 recognition using brain activity. Proceeding 94 CompSysTech '08 Proceedings of the 9th International 90 85 85 82 Conference on Computer Systems and Technologies and 80 81 78 80 Workshop for PhD Students in Computing, 2008. 70 70 [5] A. Metallinou, S. Lee, S. Narayanan. Audio-Visual Emotion Recognition Using Gaussian Mixture Models 60 Anger Boredom Disgust Fear Happiness Neutral Sadness Average for Face and Voice. Multimedia. 2008. ISM 2008. IEEE Figure 6: SVM accuracy per class value for Leave-One- International Symposium on Multimedia, 2008. Speaker-Out cross-validation with top ranked 1000 [6] P. Ekman. Emotions in the Human Faces. 1982. features. [7] James A. Russell. A circumplex model of affect. 1980. [8] P. N. Juslin, K. R. Scherer. Vocal expression of affect. 5 CONCLUSION In J. A. Harrigan, R. Rosenthal, & K. R. Scherer (Eds.). The new handbook of methods in nonverbal behavior The results showed that SVM outperforms the KNN and research, pp. 65-135, 2004. Naïve Bayes. By using the top ranked 500 features by gain [9] K. R. Scherer. Vocal communication of emotion: A ratio, SVM achieved the highest accuracy of 91%. review of research paradigms. Speech Communication In addition, the 10 fold cross-validation of the SVM showed 40: 227–256. 2003 that highest accuracy of 89% was achieved by using the top [10] M. E. Mena. Emotion Recognition From Speech 750 ranked features. By using the top 300 ranked features Signals, 2012. the accuracy was 87%. This is the so-called “knee” on the [11] D. Ververidis, C. Kotropoulos. A review of emotional graph, which represents the best tradeoff between the speech databases. In: PCI 2003. 9th Panhellenic number of features and the achieved performance. Conference on Informatics., pp. 560–574, 2003. Regarding the accuracy for each of the 7 emotions, [12] F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, experiments were performed with the top ranked 750 B. Weiss. A Database of German Emotional Speech. features by gain ratio. The best recognized emotion was the 2005. In: Proc. Interspeech. pp. 1517–1520. “sadness”, with 97%; and the worst recognized emotion was [13] F. Eyben, M. Wöllmer, B. Schuller. openSMILE - The the “happiness” with 68% accuracy. Munich Versatile and Fast Open-Source Audio Feature With LOSO cross-validation, the SVM achieved highest Extractor. 2010. accuracy of 82% by using the top 1000 ranked features. By [14] F. Eyben, F. Weninger, M. Wollmer, Bjorn Schuller. 
5 CONCLUSION
The results showed that SVM outperforms KNN and Naïve Bayes. By using the top-ranked 500 features by gain ratio, SVM achieved the highest accuracy of 91%.
In addition, the 10-fold cross-validation of the SVM showed that the highest accuracy of 89% was achieved by using the top 750 ranked features. By using the top 300 ranked features the accuracy was 87%. This is the so-called "knee" of the graph, which represents the best tradeoff between the number of features and the achieved performance. Regarding the accuracy for each of the 7 emotions, experiments were performed with the top-ranked 750 features by gain ratio. The best recognized emotion was "sadness", with 97%, and the worst recognized emotion was "happiness", with 68% accuracy.
With LOSO cross-validation, the SVM achieved the highest accuracy of 82% by using the top 1000 ranked features. By using the top 500 ranked features the accuracy was 80%. Regarding the accuracy per emotion, experiments were performed with the top-ranked 1000 features. The highest accuracy per class (emotion), 94%, was achieved for the class "sadness" and the lowest, 70%, for the class "fear".
The results showed that the classifier achieves better accuracy with the 10-fold cross-validation technique than with the LOSO validation technique. The reason for this is that with 10-fold cross-validation the training and the testing data usually contain data samples of the same speaker. This is not the case if the system is intended to be used in real life for users not known in advance. However, a hybrid approach that includes a calibration phase at the beginning (for example, asking the user to record several data samples) is considered for future work.

References
[1] D. G. Myers. Theories of Emotion. Psychology: Seventh Edition. New York, NY: Worth Publishers, 2004.
[2] V. Perez-Rosas, R. Mihalcea. Sentiment Analysis of Online Spoken Reviews. Interspeech, 2013.
[3] A. Halder, A. Konar, R. Mandal, A. Chakraborty. General and Interval Type-2 Fuzzy Face-Space Approach to Emotion Recognition. IEEE Transactions on Systems, Man, and Cybernetics, 43(3), 2013.
[4] R. Horlings, D. Datcu, L. J. M. Rothkrantz. Emotion recognition using brain activity. Proceedings of CompSysTech '08, the 9th International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing, 2008.
[5] A. Metallinou, S. Lee, S. Narayanan. Audio-Visual Emotion Recognition Using Gaussian Mixture Models for Face and Voice. ISM 2008, IEEE International Symposium on Multimedia, 2008.
[6] P. Ekman. Emotions in the Human Face. 1982.
[7] J. A. Russell. A circumplex model of affect. 1980.
[8] P. N. Juslin, K. R. Scherer. Vocal expression of affect. In J. A. Harrigan, R. Rosenthal, K. R. Scherer (Eds.), The New Handbook of Methods in Nonverbal Behavior Research, pp. 65-135, 2004.
[9] K. R. Scherer. Vocal communication of emotion: A review of research paradigms. Speech Communication 40: 227–256, 2003.
[10] M. E. Mena. Emotion Recognition From Speech Signals, 2012.
[11] D. Ververidis, C. Kotropoulos. A review of emotional speech databases. In: PCI 2003, 9th Panhellenic Conference on Informatics, pp. 560–574, 2003.
[12] F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, B. Weiss. A Database of German Emotional Speech. In: Proc. Interspeech, pp. 1517–1520, 2005.
[13] F. Eyben, M. Wöllmer, B. Schuller. openSMILE - The Munich Versatile and Fast Open-Source Audio Feature Extractor. 2010.
[14] F. Eyben, F. Weninger, M. Wöllmer, B. Schuller. openSMILE Documentation. Version 2.0.0, 2013.
[15] H. Deng, G. Runger, E. Tuv. Bias of importance measures for multi-valued attributes and solutions. Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN 2011), 2011.
[16] I. Kononenko, E. Šimec, M. Robnik-Šikonja. Overcoming the myopia of inductive learning algorithms with RELIEFF. Applied Intelligence, forthcoming.
[17] J. Demšar, B. Zupan. Orange: From Experimental Machine Learning to Interactive Data Mining. White paper (http://www.ailab.si/orange). Faculty of Computer and Information Science, University of Ljubljana.
[18] D. Aha, D. Kibler. Instance-based learning algorithms. Machine Learning 6: 37-66, 1991.
[19] N. Cristianini, J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.
[20] S. Russell, P. Norvig. Artificial Intelligence: A Modern Approach. Second Edition, Prentice Hall.

QUALCOMM TRICORDER XPRIZE FINAL ROUND: A REVIEW
Anton Gradišek, Maja Somrak, Mitja Luštrek, Matjaž Gams
Department of Intelligent Systems and Solid State Physics Department
Jožef Stefan Institute
Jamova cesta 39, 1000 Ljubljana, Slovenia
Tel: +386 1 4773967
e-mail: anton.gradisek@ijs.si

ABSTRACT
conditions, independent of professional health care personnel. Being able to diagnose common medical conditions at home benefits both the users, by directing them to see the doctor if needed, and the healthcare system itself, by reducing the costs and waiting times at medical centers.
The Qualcomm Tricorder XPRIZE competition began in January 2012, with the goal of developing a mobile device to monitor health parameters and quickly diagnose several common medical conditions.
In August To be precise, there are two criteria in the competition: i) to 2014, a list of ten finalists was announced, including a continuously monitor key health metrics (blood pressure, Slovenian team MESI Simplifying diagnostics that respiratory rate, heart rate, temperature and the oxygen brings together companies MESI, D· Labs, and saturation – SpO2), and ii) to diagnose a set of 13 pre- Gigodesign, and partners from academia, Jožef Stefan selected (core set) health conditions (Anemia, Atrial Institute and Faculties of Electrotechnics and Medicine Fibrillation, Chronic Obstructive Pulmonary Disease of the University of Ljubljana. In this review, we (COPD), Diabetes, Hepatitis A, Leukocytosis, Pneumonia, present the XPRIZE competition, we briefly look at the Otitis Media, Sleep Apnea, Stroke, Tuberculosis, Urinary ten finalists and more closely at the MESI Simplifying Tract Infection, Absence of condition) and three other diagnostics approach. Special attention is given to the conditions from an additional set (Airborne Allergens, diagnostic algorithm that was developed in order to Cholesterol Screen, Food-borne Illness, HIV, Hypertension, facilitate the diagnostic process. Hypo- and Hyperthyroidism, Melanoma, Mononucleosis, 1 INTRODUCTION Osteoporosis, Pertussis, Shingles, Strep Throat). Furthermore, the consumer experience represented an XPRIZE, formerly known as the X Prize Foundation, is a important component of the qualifying round evaluation non-profit organization that was established in order to criteria. stimulate innovation for the benefit of humanity through incentivized competition. The challenges are “audacious, Around 300 teams from all over the world entered the but achievable, tied to objective, measurable goals” [1]. The competition, with 34 teams reaching the qualifying round. In first prize from the foundation was the Ansari XPRIZE that August 2014, ten teams were chosen for the final round of offered a US$10 million prize for the first non-government the competition that will include testing the products on real organization to launch a reusable manned spacecraft into patients during the summer of 2015. The winners of the space twice within two weeks. The prize was won by an competition will be announced in January 2016. aerospace company Scaled Composites with their In this review paper, we present the ten finalists and their SpaceShipOne [2]. In the following years, other XPRIZEs approaches, based on the information made public so far. were announced, such as Google Lunar XPRIZE that Some teams unveiled several details about their products focuses on launching and landing a robotic spacecraft on while others are more secretive. We pay special attention to the Moon, with sending data back to Earth. the MESI Simplifying diagnostics team approach and the diagnostic algorithm. The Qualcomm Tricorder XPRIZE [3] was launched in January 2012. The name was inspired by the science-fiction TV series Star Trek, where “tricorder” was a device that 2 FINALIST TEAMS immediately diagnosed medical conditions of the patients. Among ten finalists, there are four teams from the United The sponsor of the prize is Qualcomm, an American States, two from the United Kingdom, and one from each semiconductor company that focuses on wireless Taiwan, Canada, India, and Slovenia. The teams are telecommunications technologies. 
The aim of the presented as on the XPRIZE website, except for MESI competition is to revolutionize the healthcare system by Simplifying diagnostics that is presented separately later on. developing an instrument capable of measuring some key health parameters and diagnosing a set of common medical 42 Aezon [4] is a team of student engineers from Johns Final Frontier Medical Devices [9] is a team from Paoli, Hopkins University in Baltimore, Maryland (US), with Pennsylvania (US), connected to Basil Leaf Technologies. several partners from the industry. Their solution consists of They are developing a device called DxtER, which relies on four components, each being developed by or in partnership algorithms developed by medical experience as well as on with a specialized company. The vital signs monitoring unit actual patient charts. Concept art for the product indicates is designed to wrap around the neck, like a neck support the device is roughly spherically shaped with integrated pillow, and is being designed by a startup company Aegle. sensors. The diagnostic module is exploiting microfluidic chip technology and qPCR to test for the presence of pathogens Scanadu [10] is a team from Moffett Field, California (US). and is being developed in partnership with Biomeme. The The team’s product is called Scanadu Scout, which is a disk- data is processed by a smartphone app that also uses shaped device that contains sensors for temperature, hearth algorithms to direct users towards relevant tests. In addition, rate, and blood pressure. The disk is to be held between the the phone uses software for spirometry, developed by thumb and index finger and placed on the forehead. The data SpiroSmart. The data is stored on a cloud where an API uses is transferred to a smartphone and processed there. No big data to help turn user reported symptoms into diagnostic technical specifications are known yet, neither is the solutions. The team also participated in an Indiegogo approach for the diagnostic module. Scanadu ran an campaign where they raised around 5000 US$. Indiegogo campaign from May to July 2013 and managed to raise over US$ 1.6 million. The campaign has also received CloudDX [5] is a Canadian team, associated with the considerable media coverage. company Biosign Technologies, a manufacturer of medical devices. The vital signs unit is placed around the neck; it SCANurse [11] is a team from London, UK. Their system uses two electrodes at upper chest area to monitor ECG and consists of blood, vitals, breath, and image units. No specific an ear bud with an infrared temperature sensor to measure information was provided on their website at the time of body temperature. An ear clip uses photoplethysmograph to writing. monitor breathing and heart rate. Blood pressure is measured by a wrist monitor with the pulse transit time approach. The Zensor [12] is a team from Belfast, UK, connected to diagnostic module is designed to analyze saliva, blood, and Intelesens Responsive Healthcare, a company working on urine. The team is working with industrial partners to non-invasive vital signs monitoring. Their prototype can consolidate multiple tests onto one multi-strip cassette. In detect 3-lead ECG, respiration rate, temperature, and addition, an application was developed to accept data from motion. SpO2 sensor is being developed. To diagnose fitness devices and to integrate them into the system. 
medical conditions, urine and blood analysis is included, Danvantri [6] is a team from Chennai, India, associated although the details have not been made public yet. with American Megatrends. The main component of their product is a handheld health monitor that features a 3-Lead 3 MESI SIMPLIFYING DIAGNOSTICS ECG electrode to measure ECG signals from the finger, pulse oximeter, an infrared temperature sensor, camera, 3- MESI Simplifying diagnostics [13] is a team from axis accelerometer for monitoring physical activity and a Ljubljana, Slovenia. The team consists of partners from the glucometer strip attachment node. Additional devices industry and the academia. The team is led by MESI, a start- include a wireless spirometer, neckband ECG/EEG meter, up company that specializes in development of medical otoscope, and urine sample analyzer. The data is processed devices. Their flagship product is an ankle-brachial index and visualized either on a smartphone or on a tablet. measuring device (ABPI MD) for the detection of peripheral arterial disease. Company D·Labs is responsible for a DMI [7] is a team from Boston-Cambridge, Massachusetts mobile app and API while Gigodesign focuses on improving (US), connected to the DNA Medicine Institute. They the user experience and industrial design. Partners from developed the rHEALTH Sensor which is a device that academia come from Jožef Stefan Institute (Department of employs fluorescence detection optics, microfluidics, and Intelligent Systems) and two faculties of University of nanostrip reagents to perform a suite of hematology, Ljubljana, Faculty of Electrotechnics and Faculty of chemistry, and biomarker assays from blood. The device Medicine. Academic partners are responsible for the was developed in collaboration with NASA to monitor development of algorithms and for expert medical astronaut health. knowledge. The system consists of several modules [14]. A bracelet monitors activity and three vital signs, ECG, SpO2, Dynamical Biomarkers Group [8] is a team from Taiwan. and temperature. A “shield” module is placed on the upper Their system consists of five components: Smart Vital- arm and consists of a wireless cuff for blood pressure Sense-Patch and Smart Vital-Sense-Wrist module; Smart measurements. It also contains a patch located on the chest Blood Sense module; Smart Scope module; Smart Exhaler to measure SpO2, temperature, ECG, respiratory rate, and module; and Smart Urine Sense module. The modules are activity tracking. Data obtained from the bracelet and the connected to a smartphone app that runs algorithms based on shield module fulfill the vital signs monitoring requirement proprietary algorithms to conduct a diagnosis. of the competition. 43 The diagnosis of medical conditions is performed with the 4 DIAGNOSTIC ALGORITHM help of a smartphone app and aims to recognize all The diagnostic algorithm was developed at the Department conditions from the core set, together with Hypertension, of Intelligent Systems of Jožef Stefan Institute by Maja Melanoma, and Strep Throat. The user, that has already Somrak, Mitja Luštrek, Matjaž Gams, and the author of this performed the vital signs measurements, indicates his review [15]. The aim of the algorithm is to predict the concern: “I feel pain” or “I feel unwell”. If “pain” is chosen, medical condition of the patient, based on the symptoms that the user specifies the type of the pain on a schematic human he or she experiences. Around 60 different symptoms are figure (such as “chest pain”). 
Based on vital signs data and taken into account. The problem is highly non-trivial. There the type of pain/feeling unwell, the algorithm generates a list is no simple function that would map the domain of a group of possible symptoms that the user may experience. This list of symptoms to a codomain containing a single disease. is generated to include both the symptoms that the user most People with the same medical condition may experience probably experiences at the time and would probably want different symptoms, for example, people with Otitis Media to report, and also the most relevant symptoms that would may or may not experience a headache or a discharge from help the physician or the diagnostic method set a reliable the ear. An individual symptom is usually typical for several diagnosis. Based on the chosen symptoms, the algorithm different diseases. For example, elevated temperature is then asks for a couple of additional symptoms in order to typically exhibited in cases of Tuberculosis, Pneumonia, narrow down the diagnosis and direct the user to one or Strep Throat, Otitis Media, and others. On the other hand, more specialized modules that confirm or reject the even healthy people (“absence of conditions”) often suggested diagnosis. There are four specialized modules. A experience some symptoms due to reasons that are not module “To see” includes a camera which is used to connected to diseases. Fatigue may be related to a lack of diagnose Melanoma and Strep Throat. Using a special sleep while high blood pressure may be a consequence of camera is advantageous to using the integrated camera in a drinking caffeinated drinks. In addition, asking the patient smartphone since the specifications of phone cameras may for all symptoms on the list is not considered user-friendly, vary from a model to a model. In addition, light conditions therefore the goal is to diagnose the medical condition as are easier to control with a dedicated module. A module “To accurately as possible using as small number of questions as hear” includes microphones that are used to monitor possible. In order to achieve the best performance, the breathing – in order to detect pulmonary diseases. This algorithm combines expert medical knowledge and methods module also allows user to perform a spirometry which is of artificial intelligence. At this point, we only aim to used to diagnose COPD. The urine module, “Pee”, performs diagnose the diseases of patients with a single medical urine analysis using test strips and a camera that reads the condition. Diagnosing a combination of more than one test results. The fourth module is called “Blood” and is disease for a single patient is a next-level problem. intended for blood tests. In order to achieve best user experience, this module should rely on non-invasive As discussed above, the initial input for the algorithm comes methods, such as spectroscopy, although it is more likely from the vital signs measurements (symptoms such as that a drop of blood will be required for analysis in the final elevated temperature or high blood pressure) and from the version. This module is intended for detection of diabetes pain symptoms that the user chooses. Additionally, for and anemia. personalized tests, the algorithm may also include identified risk factors for a particular user (from the algorithm point of view, we also treat the risk factors as “symptoms”). 
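The step described here, turning vital-signs measurements and risk factors into the "initial symptoms", can be sketched as a small rule-based mapping. This is only an illustration, not the authors' code; the thresholds and risk-factor rules below are assumptions chosen for the example.

# Illustrative sketch: deriving the "initial symptoms" from vitals and risk factors.
def initial_symptoms(vitals, user_profile):
    """vitals: dict of measured values; user_profile: dict with age, BMI, smoker flag."""
    symptoms = set()
    if vitals.get("temperature", 0.0) > 37.5:        # assumed cut-off in degrees C
        symptoms.add("elevated temperature")
    if vitals.get("systolic_bp", 0) > 140:           # assumed hypertension threshold
        symptoms.add("high blood pressure")
    if vitals.get("spo2", 100) < 94:
        symptoms.add("low oxygen saturation")
    # risk factors are treated as symptoms from the algorithm's point of view
    if user_profile.get("smoker") or user_profile.get("age", 0) > 65:
        symptoms.add("risk factor: COPD")
    if user_profile.get("bmi", 0) > 30:
        symptoms.add("risk factor: diabetes")
    return symptoms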
For example, smokers and older people are more likely to develop COPD than non-smokers, people with a high BMI have higher risks for diabetes, etc. All these are called the “initial symptoms”. The additional list of suggested symptoms is generated using association rules (ARs) and the minimum-Redundancy-Maximum-Relevance (mRMR) method. The ARs (symptom A –> symptom B) are used to produce a set of probable additional symptoms. The goal of the mRMR method is to select symptoms that are as mutually dissimilar as possible and at the same time as indicative of the medical condition as possible. In other words, the algorithm tries to avoid asking the user about several similar symptoms and at the same time ask about symptoms that cover all spectrum of probable medical Figure 1: MESI Simplifying diagnostics system: a bracelet, smartphone app, and four diagnostic modules – To see, To conditions. hear, Blood, and Pee. The Shield module is not shown here, it comes in form of a sleeve with attachable electrodes. In the following step, the information gathered up to this point is used for actual disease prediction. The probabilities 44 for the 15 medical conditions are evaluated using a set of patient to use a specialized module which confirms or rejects J48 classifiers, one for each of the conditions. There are two the prediction. probability thresholds: conditions above the high threshold are considered very probable and conditions below the low An overview of the algorithm, developed at Jožef Stefan threshold are considered unlikely. The area between the two Institute, is presented. The algorithm combines expert thresholds is a so-called “gray zone” where we do not have medical knowledge with methods of artificial intelligence enough information to make a reliable claim whether the and machine-learning. The aim of the algorithm is to make medical condition is present or not. The diagnostic an accurate prediction of diagnosis with a small number of procedure terminates when all conditions from the list are questions, to improve the user experience. We outline the either above the high or below the low threshold. If one or challenges of the task. Testing of the algorithm on real more condition remain in the gray zone, at least one patient data is currently underway and the results will be additional question (symptom) is required for a confident published later. prediction. The additional symptom is chosen according to the highest information gain (IG) that an individual Reference symptom would bring. [1] http://www.xprize.org/ [2] http://space.xprize.org/ansari-x-prize Calculation of the IG and mRMR values, searching for ARs, [3] http://www.qualcommtricorderxprize.org/ and building the J48 classifiers is based on two types of data [4] http://www.aezonhealth.com/ – real and simulated patient data. Real patient data was [5] http://www.clouddx.com/ collected either with both patients with medical conditions [6] http://www.vitalsplus.com/ and healthy individuals filling in a questionnaire about the [7] http://www.dnamedinstitute.com/ symptoms they experience (a complete set of symptoms), or [8] http://dbg.ncu.edu.tw/ by medical doctors retroactively filling in the symptom [9] http://www.basilleaftech.com/ tables for real patients. The simulated dataset was build [10] https://www.scanadu.com/ using expert medical knowledge. 
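The two-threshold questioning loop described in this section can be condensed into a short sketch. It is only an illustration of the idea, not the published system: the per-condition classifiers (each assumed to expose predict_proba returning the probability that the condition is present), the ask_user and information_gain helpers, and the threshold values are all assumptions.

# Illustrative sketch: gray-zone loop that asks for one more symptom at a time.
def diagnose(symptoms, classifiers, candidate_symptoms, ask_user, information_gain,
             low=0.2, high=0.8, max_questions=20):
    """symptoms: dict symptom -> 0/1 holding the initial symptoms of the current patient."""
    probs = {c: clf.predict_proba(symptoms) for c, clf in classifiers.items()}
    for _ in range(max_questions):
        gray = [c for c, p in probs.items() if low <= p <= high]
        unasked = [s for s in candidate_symptoms if s not in symptoms]
        if not gray or not unasked:      # every condition is confidently decided
            break
        # ask about the symptom with the highest expected information gain for the gray zone
        next_symptom = max(unasked, key=lambda s: information_gain(s, gray, symptoms))
        symptoms[next_symptom] = ask_user(next_symptom)   # 0 or 1 answered by the user
        probs = {c: clf.predict_proba(symptoms) for c, clf in classifiers.items()}
    very_probable = [c for c, p in probs.items() if p >= high]
    return very_probable, probs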
Physicians prepared a table [11] http://www.scanurse.com of probabilities for patients suffering from each of the [12] http://www.intelesens.com medical conditions to exhibit each of the symptoms from the [13] http://www.simplifyingdiagnostics.com/ list, based on their professional experiences. Using this [14] M. Somrak, M. Luštrek, J. Šušterič, T. Krivc, A. table, it is possible to generate millions of distinct “virtual Mlinar, T. Travnik, L. Stepan, M. Mavsar, M. Gams: patients”. Initial tests using only simulated data showed high Tricorder: Consumer Medical Device for Discovering sensitivity and specificity for disease diagnostics [15]. Tests Common Medical Conditions, Informatica 38 (2014) using a combination of real and simulated data are currently 81–88. underway. [15] M. Somrak, A. Gradišek, M. Luštrek, A. Mlinar, M. Sok, M. Gams: Medical diagnostics based on combination of sensor and user-provided data. AI- 5 CONCLUSIONS AM/NetMed 2014, Artificial Intelligence and Assistive We present an overview of the Qualcomm Tricorder Medicine: proceedings of the 3rd International XPRIZE competition and the teams that reached the final Workshop on Artificial Intelligence and Assistive round, with a special focus on the Slovenian team entry. The Medicine co-located with the 21st European Conference approaches of many teams are similar to some degree. The on Artificial Intelligence (ECAI 2014), Prague, Czech most common approach is to use of single a device with a Republic, pp. 36-40 number of integrated sensors to monitor vital signs (the first competition task). The second task, the diagnosis of medical conditions, is typically achieved using a series of dedicated additional modules. Some teams rely strongly on detection of biomarkers in body fluids while others also incorporate technologies such as spirometry and image-processing algorithms. Several teams mention they use algorithms for diagnostics, although not much has been revealed to the public so far. The MESI Simplifying diagnostics approach consists of a bracelet and a “Shield” module to monitor vital signs. The diagnosis of medical conditions is obtained using an algorithm that runs on a mobile device. The algorithm uses the vital signs data and the symptoms entered by the patient to predict a possible medical condition and to direct the 45 AVTOMATIZACIJA IZGRADNJE BAZE ODGOVOROV VIRTUALNEGA ASISTENTA ZA OB ČINE Leon Noe Jovan, Svetlana Nikić, Damjan Kužnar, Matjaž Gams Odsek za inteligentne sisteme, Institut “Jožef Stefan”, Jamova cesta 39, 1000 Ljubljana e-mail: leon.jovan@gmail.com POVZETEK vprašanja je zelo zamudno, kar lahko vpliva na to, koliko občin bo sodelovalo pri projektu. Zato se pojavlja potreba po Asistent je inteligentni virtualni pomočnik, ki odgo- avtomatizirani rešitvi, ki bi ustvarila odgovore za posamezno varja na vprašanja, postavljena v naravnem jeziku in je občino za celotno zlato osnovo. sposoben poiskati odgovore na spletnih straneh. Cilj pro- Osnovna ideja naše rešitve je, da poskušamo čimbolj av- jekta Asistent je vzpostavitev spletne storitve za izdelavo tomatizirati vnašanje podatkov v bazo asistenta. S klasi- in urejanje prilagojenega virtualnega asistenta, ki si ga fikacijo želimo določiti, na kateri strani spletni strani občine bodo lahko občine namestile na svoje spletne strani in se nahaja podatek, ki ga zahteva posamezno pravilo. S tako obiskovalcem olajšale iskanje informacij, ki jih stran kratkimi skriptami pa želimo nato iz spletne strani pridobiti ponuja. 
Ta prispevek opisuje postopke avtomatizacije iz- podatek ter ga prikazati v kratkem, uporabniku prijaznem gradnje baze asistentovih odgovorov z uporabo različnih odgovoru. pristopov strojnega učenja ter ekstrakcije informacij iz Na koncu projekta bi tako vsaka občina lahko imela sebi spletnih strani. Ta postopek bo olajšal delo občin pri prilagojenega asistenta, s čimer bi vsi občani dobili možnost uvajanju asistenta, kar predvsem vpliva na razširjenost naravnega poizvedovanja in komuniciranja z občinami. uporabe na spletnih straneh občin. Opisana je os- novna ideja arhiterkture sistema za avtomatizacijo grad- Problem je soroden ekstrakciji informacij (angl. Informa- nje odgovorov virtualnega asistenta, predvsem pa so pred- tion extraction) iz HTML dokumentov [4]. Svetovni splet je stavljeni pristopi za generiranje odgovorov in ekstrakcijo zbirka velike količine dokumentov, vendar pa podatki niso na- podatkov iz spletnih strani. jbolje strukturirani. Naša naloga je, da iz takšnih nestrukturi- ranih podatkov najdemo podatke, ki so za nas uporabni. V zadnjem času je bilo predlaganih več različnih pristopov 1 UVOD za ekstraktcijo informacij iz spleta. Pristopi vključujejo Večina slovenskih spletnih strani je omejena z uporabo uporabo strojnega učenja, iskanja vzorcev in podobno z ra- starejših spletnih tehnologij, kar povzroča oteženo iskanje po zličnimi stopnjami avtomatizacije [5]. njih. Splošni iskalniki v povprečju najdejo le med 10% do 30% ustreznih odgovorov [13]. Ena izmed možnih rešitev je 2 PROJEKT ASISTENT inteligentni virtualni pomočnik oz. asistent, ki zna odgov- oriti na vprašanja v naravnem jeziku. Asistenti se pojavljajo Celoten proces izgradnje baze odgovorov je sestavljen iz treh kot pomoč pri iskanju po spletnih straneh, pametnih telefonih, korakov, in sicer pridobivanje in priprava podatkov, klasi- itd. Ena najbolj poznanih asistentk na svetu je npr. Siri, ki jo fikacija in generiranje odgovorov. najdemo na novejših sistemih iOS podjetja Apple Inc. in se jo Najprej moramo podatke pridobiti in jih pripraviti za lahko uporablja v več svetovnih jezikih. Prva virtualna asis- nadaljno obdelavo. Podatke pridobimo s spletnim pajkom, tentka v slovenščini pa je Vida, ki je nastala kot pomoč pri ki obišče vse spletne strani občine in jih v obliki HTML iskanju po straneh DURSa. dokumenta shrani v interno bazo podatkov. Iz teh datotek Cilj celotnega projekta [12] je ustvariti virtualne asistente nato izluščimo celotno besedilo, ga lematiziramo in označimo za slovenske občine, ki bi bili potem lahko dostopni preko besede z označevalnikom, saj bomo te podatke uporabili v njihovih spletnih strani. naslednjih korakih. Ta dva postopka naredimo z lematizator- Vsaka baza znanja za neko občino je sestavljena iz vnosov, jem LemmaGen [9] in Oblikoslovni označevalnik za sloven- vsak vnos pa vsebuje vprašanje, imenovano pravilo, in ski jezik [7]. Spletnega pajka za shranjevanje spletnih strani odgovor, pri čemer so pravila ključne besede vprašanj, ki jih smo izdelali z uporabo Java knjižnjice Jsoup [6]. zastavljajo uporabniki. Vsak asistent občine ima približno Naslednji korak je klasifikacija, kjer moramo pridobljene 500 pravil, ki so enaka za vse občine, odgovore pa je potrebno spletne strani razvrstiti med skoraj 500 vnosov. Za klasi- kreirati za vsako posamezno občino. Tem pravilom pravimo fikacijo uporabimo lematizirana besedila spletnih strani, ki “zlata osnova”. 
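A minimal sketch of the page-classification step introduced here follows: each lemmatized municipality page is assigned to one of the roughly 500 rules, using the bag-of-words / TF-IDF representation mentioned in this paper. The sketch is illustrative only; scikit-learn and logistic regression stand in for the project's Java implementation, whose classifier is not specified here.

# Illustrative sketch: classifying lemmatized pages among the ~500 "zlata osnova" rules.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_page_classifier(lemmatized_pages, rule_ids, max_features=5000):
    """lemmatized_pages: list of page texts (already lemmatized and POS-tagged),
    rule_ids: the rule each training page answers (labels from manually filled entries)."""
    model = make_pipeline(
        TfidfVectorizer(max_features=max_features),  # keep only the most important words
        LogisticRegression(max_iter=1000),
    )
    model.fit(lemmatized_pages, rule_ids)
    return model

# usage: rule = train_page_classifier(pages, labels).predict([new_page_text])[0]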
Zlata osnova je bila oblikovana iz pravil in smo jih predstavili kot vrečo besed [2], z mero TF-IDF [3] pa odgovorov, ki so jih določene občine ročno vnesle v svoje izberemo le najbolj pomembne besede, saj je preveč različnih asistente na začetku projekta. Ročno vnašanje odgovorov na besed, da bi obravnavali vse. 46 Figure 1: Arhitektura sistema Za generiranje odgovora za določen vnos imamo torej http://www.uradni-list.si/1/search?smode=ul& na voljo spletno stran, ki smo jo uvrstili, da vsebuje po- cmd=search&q=iskalni_niz&rubm=s& datke primerne temu pravilu, poleg pa tudi lematizirano in selectItem=id_obcine&rublist%5B%5D= označeno besedilo te strani, včasih pa tudi katere od drugih id_obcine virov (Wikipedia, Uradni list, ...). Iz teh podatkov torej gener- iramo kratek odgovor, ki ga prikaže virtualni asistent. Iskalni niz smo določili glede na pravilo, ID številko občine pa smo prebrali iz spletne strani tako, da smo poiskali ustrezen HTML element, ki se je nahajal med možnostmi za 3 AVTOMATIZACIJA IZGRADNJE BAZE ODGOV- filtriranje in je vseboval ime občine ter ID za filtriranje po OROV tej občini. Za pravilno delovanje smo vse šumnike pretvo- Izdelali smo program, ki je zmožen zgraditi bazo odgovorov rili s Percent-encoding [8], ki se uporablja za kodiranje ne- 471 vprašanj, ki so skupna vsem občinam (“zlata osnova”). standardnih znakov v URL naslovih. Rezultate iskanja smo Sistem za generiranje odgovorov je sestavljen iz 471 razre- nato prebrali iz vrnjene spletne strani. Če je vrnjen rezultat dov, izdelanih v programskem jeziku Java, ki so poimeno- samo eden, smo povezavo do tega elementa dodali v odgovor, vani po ključu Rule < IDpravila > glede na pravilo, za če pa je rezultatov iskanja več, se v odgovor dodajo prvi trije katerega generirajo odgovor. Razred, ki generira odgovor, dokumenti s pripisom, da se več dokumentov nahaja na strani potrebuje le spletno stran, iz katere črpa informacije, v neka- v ozadju. terih primerih pa tudi lematizirano in označeno besedilo te spletne strani. Osnoven razred je RuleClass, ki omogoča 3.2 Povzetki branje vseh potrebnih podatkov ter vračanje odgovora. Izde- lali smo hierarhijo razredov, ki dedujejo in so nadgradnja Vprašanja so v nekaterih primerih precej splošna in običajno RuleClass razreda, omogočajo pa generiranje odgovorov, ki zahtevajo daljši odgovor, ki ni primeren za asistentov so bolj specifični in rešujejo določen problem, ki se pogosto odgovor. Zato asistent odgovori s krajšim povzetkom, več pojavlja. Takšni problemi so pridobivanje obrazcev, kontak- o tem pa si uporabnik lahko prebere na strani, ki se mu tov, imen oseb, povzetkov iz daljšega besedila in povezav. prikaže v ozadju. Takšna vprašanja so na primer opisi kul- Pristopi za reševanje takšnih problemov so opisani v nadal- turnih znamenitosti, kmetijstva, predstavitev grba ter zastave jevanju. in podobno. Algoritem deluje na principu ključnih besed in regularnih 3.1 Vloge in obrazci izrazov, ki jih predhodno podamo za posamezno pravilo. Po- damo lahko seznam ključnih besed, ki jih algoritem pretvori Za pridobitev obrazcev, ki so specifični za vsako občino, v regularne izraze. Potem iz celotnega HTML dokumenta smo uporabili spletno stran Uradni list [11], ki objavlja za- rekurzivno odstranimo vse vrstične elemente, njihovo vsebino kone, predpise in druge javne objave v Republiki Sloveniji. pa dodamo staršu tega elementa. 
Ta postopek nam omogoča S pomočjo iskalnika na spletni strani računalnik pridobi lažje določanje besedilnih enot, saj nam vmesni vrstični ele- določen obrazec, ki ga potem posreduje uporabniku v obliki menti ne delijo enega odstavka na več delov. Po končanem odgovora. postopku odstranjevanja vrstičnih elementov, algoritem pre- Do dokumentov na spletni strani Uradni list smo dostopali gleda vsa vozlišča z besedilom (TextNode) in prešteje, ko- tako, da smo naredili zahtevek z ustreznimi GET parametri, likokrat se elementi iz seznama regularnih izrazov pojavljajo ki od strani zahtevajo, da nam poišče določene dokumente. v besedilu. Na podlagi te ocene algoritem izbere najboljši 47 odstavek in ga prikaže kot odgovor. dobro deluje generiranje odgovorov. Za učni primer smo vzeli občino Pivka in njene spletne strani ter vnose, saj ima najbolj popolne odgovore. Kvaliteto 3.3 Kontakti ustvarjenih odgovorov smo preizkusili na podlagi 16 drugih Nekateri kontakti so na voljo samo na straneh občine in jih je občin. Ročno smo pregledali vse vnose in strani, katere so potrebno pridobiti neodvisno od oblike strani. Sem spadajo vpisale občine, in izločili tiste, ki niso bile primerne. Tako na primer kontakt direktorja občine, svetovalcev in podobno. smo izločili vpliv napačne klasifikacije na kvaliteto ustvar- Kontakt je v osnovi sestavljen iz imena, naziva, telefonske jenih odgovorov. S tem smo pri nekaterih občinah močno številke ter elektronskega poštnega naslova oziroma kombi- zmanjšali število pravil, saj je bilo veliko podanih spletnih nacije le-teh. Najprej je bilo tako potrebno iz besedila pre- strani oziroma odgovorov napačnih ali pa so bili prepisani od poznati te elemente. neke druge občine. Rezultat generiranja odgovorov smo pre- Za prepoznavanje imen smo naredili bazo imen, pri čemer gledali ročno, jih primerjali z ročno vpisanimi odgovori, ki so smo za osnovo vzeli podatke z Wikipedie [1]. V besedilu smo jih izdelale občine, ter izračunali delež ustreznih odgovorov. nato lahko imena preprosto našli s pregledom baze. Odgovor je bil ocenjen kot ustrezen, če je vseboval iskane po- datke v uporabniku prijazni obliki. Sem torej niso šteti odgov- Za iskanje telefonskih številk smo si pomagali s knjižnico ori, ki pozovejo uporabnika, naj si ogleda stran v ozadju. Libphonenumber [10], ki najde v podanem besedilu vse tele- Rezultate ocenjevanja ustreznosti ustvarjenih odgovorov fonske številke neke države. prikazuje spodnja tabela 4. Prvi stolpec predstavlja občino, Elektronske naslove smo iskali s pomočjo regularnega za katero smo preverjali rezultate, drugi število vseh pravil, ki izraza: jih je občina vpisala in tretji število pravil, ki imajo pravilne ali vsaj delno pravilne odgovore. Zadnja dva stolpca pred- [_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@[A-Za- z0-9-]+(\\.[A-Za-z0-9]+) stavljata rezultate generiranja odgovorov. *(\\.[A-Za-z ]{2,}) Občina Vnešena Ustrezna Ustrezni Neustrezni pravila pravila odgovori odgovori Če si spletno stran predstavljamo kot drevesno strukturo Koper 343 311 (91%) 235 (76%) 76 (24%) HTML elementov, je ta algoritem iskal v tem drevesu najnižje Velenje 388 347 (89%) 269 (78%) 78 (22%) ležeče vozlišče, ki vsebuje vse te elemente - ime, naziv (po- Hrastnik 399 363 (91%) 268 (74%) 95 (26%) Hrpelje - Kozina 457 395 (86%) 314 (79%) 81 (21%) dane ključne besede), telefonsko številko in/ali poštni naslov. 
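The contact-extraction idea described above, finding the lowest HTML node that contains a name, a phone number and an e-mail address, can be sketched as follows. This is only an illustration: the project used Java with Jsoup and libphonenumber, whereas BeautifulSoup and simplified regular expressions stand in here.

# Illustrative sketch: lowest HTML element containing name, phone and e-mail.
import re
from bs4 import BeautifulSoup

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")    # simplified e-mail pattern
PHONE_RE = re.compile(r"\+?\d[\d /.-]{6,}\d")             # rough stand-in for libphonenumber

def lowest_contact_node(html, person_name):
    soup = BeautifulSoup(html, "html.parser")
    def matches(el):
        text = el.get_text(" ", strip=True)
        return person_name in text and EMAIL_RE.search(text) and PHONE_RE.search(text)
    candidates = [el for el in soup.find_all(True) if matches(el)]
    if not candidates:
        return None
    # if an element matches, all its ancestors match too, so the deepest candidate
    # is the lowest node that still contains every requested piece of information
    deepest = max(candidates, key=lambda el: len(list(el.parents)))
    return str(deepest)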
Kidričevo 209 187 (89%) 165 (88%) 22 (12%) Da bi bil odgovor v uporabniku prijazni obliki, smo odstranili Krško 456 375 (82%) 297 (79%) 78 (21%) še razne stile in vrnili HTML iz vozlišča kot odgovor. Log - Dragomer 294 244 (83%) 206 (84%) 38 (16%) Mengeš 338 270 (80%) 230 (85%) 40 (15%) Podvelka 202 193 (96%) 170 (88%) 23 (12%) Vodice 132 128 (97%) 113 (88%) 15 (12%) 3.4 Osebe Vransko 257 235 (91%) 202 (86%) 33 (14%) Celje 452 431 (95%) 350 (81%) 81 (19%) Nekatera pravila zahtevajo prepoznavanje oseb v besedilu Litija 456 264 (58%) 228 (86%) 36 (14%) in njihove vloge. Takšno pravilo je na primer ugotavljanje Šentjur 373 330 (88%) 247 (75%) 83 (25%) ˇ župana ali podžupanov občine. Za ta postopek smo upora- Zalec 448 425 (95%) 346 (81%) 79 (19%) Trbovlje 286 258 (90%) 212 (82%) 46 (18%) bili Oblikoslovni označevalnik za slovenski jezik [7], ki nam Povprečje 343 88% 82% 18% določi sklon, spol, število, besedno vrsto ter ugotovi ali gre za obče ali lastno ime. Vsi rezultati so bili narejeni na podlagi ročne klasifikacije. Župana na spletni strani smo prepoznali tako, da iščemo Nadaljnje delo tako vsebuje še klasifikacijo spletnih strani os- vsa zaporedja besed, dolga vsaj dve besedi, ki so sestavljena talih občin. Ker klasifikacija ne bo delovala popolno, lahko samo iz samostalnikov v ednini in so lastna imena. Ta za- pričakujemo nekoliko slabše rezultate. poredja besed se lahko začnejo tudi z besedami mag., dr., ali Ustreznih generiranih odgovorov je 82% odstotkov, kar pa župan v našem primeru. Vsa ta zaporedja besed še prever- ocenjujemo za zadovoljiv rezultat, ki olajša človekovo delo. imo, če vsebujejo eno od osebnih imen, ki smo jih pridobili na Višji odstotek so vrnili odgovori tistih občin, ki so imele manj wikipediji. Tako predpostavimo, da gre res za ime osebe. Na pravilno podanih povezav, saj so večinoma ostale splošne koncu izberemo osebo, ki se največkrat pojavlja v besedilu na povezave, ki so za vse občine podobne in jih je lažje generi- spletni strani. rati. 4 OBJEKTIVNA EVALVACIJA REZULTATOV 5 Zaključek Za evalvacijo rezultatov smo kot učno množico uporabili po- V tem dokumentu smo opisali postopke avtomatizacije iz- datke 16 občin, kjer so bile spletne strani ročno klasificirane. gradnje baze asistentovih odgovorov z uporabo različnih Klasifikacije za ostale občine nismo testirali, saj še ni popol- pristopov strojnega učenja ter ekstrakcije informacij iz splet- noma narejena, a to ne vpliva na testiranje generatorjev, na kar nih strani. Opisana je osnovna ideja arhiterkture sistema za smo se mi osredotočili, saj smo najprej želeli preveriti kako avtomatizacijo gradnje odgovorov virtualnega asistenta, pred- 48 vsem pa so predstavljeni pristopi za generiranje odgovorov in edge and Data Engineering, let. 18 št. 10, str. 1411-1428, ekstrakcijo podatkov iz spletnih strani. 2006. Preizkusili smo del sistema, ki iz spletnih strani prido- biva informacije in generira kratke odgovore na določena [6] Jsoup, Java HTML Parser. URL http://jsoup. vprašanja. Kvaliteta ustvarjenih odgovorov je zadovoljiva, org/. Pridobljeno 16. 9. 2014. 82% odstotkov odgovorov je bilo ustreznih, kar ocenjujejmo, da je dovolj, da olajša človeško delo. [7] Oblikoslovni označevalnik za slovenski jezik. URL Za nadaljnje delo načrtujemo izdelavo klasifikacijskega http://www.w3schools.com/tags/ref_ modela, ki bo za vsako pravilo poiskal spletno stran občine, urlencode.asp. Pridobljeno 16. 9. 2014. na kateri se nahaja ustrezen podatek. 
Od klasifikacije je zelo odvisen tudi generator odgovorov, ki smo ga izdelali, saj [8] HTML URL Encoding Reference URL http: lahko deluje precej slabše ob slabši klasifikaciji. //oznacevalnik.slovenscina.eu/Vsebine/ Sl/ProgramskaOprema/Navodila.aspx. Pridobljeno 16. 9. 2014. References [1] Seznam osebnih imen. URL [9] LemmaGen, Multilangual Open Source Lemmatisation. http://sl. URL http://lemmatise.ijs.si/. Pridobljeno wikipedia.org/wiki/Seznam_osebnih_ 16. 9. 2014. imen. Pridobljeno 1. 9. 2014. [2] Bag-of-words representation of text. URL https: [10] libphonenumber, Google’s phone number handling //inst.eecs.berkeley.edu/˜ee127a/book/ library URL https://code.google.com/p/ login/exa_bag_of_words_rep.html Pri- libphonenumber/. Pridobljeno 16. 9. 2014. dobljeno 5. 9. 2014. [11] Uradni list Republike Slovenije URL http://www. [3] Tf-idf weighting. URL http://nlp.stanford. uradni-list.si/. Pridobljeno 18. 9. 2014. edu/IR-book/html/htmledition/ [12] Projekt Asistent, Virtualni asistent za občine in tf-idf-weighting-1.html Pridobljeno 5. društva. URL http://www.projekt-asistent. 9. 2014. si/wp/. Pridobljeno 18. 9. 2014. [4] InformationExtraction URL http://en. [13] Projekt Asistent, Virtualni asistent za občine in wikipedia.org/wiki/Information_ društva, opis projekta. URL http://www. extraction Pridobljeno 3. 9. 2014. projekt-asistent.si/wp/?page_id=100. [5] C. Chang, M. Kayed, M. R. Girgis in K. F. Shaalan. A Pridobljeno 18. 9. 2014. Survey of Web Information Extraction Systems. Knowl- 49 INFERRING MODELS FOR SUBSYSTEMS BASED ON REAL WORLD TRACES Rutger Kerkhoff, Aleš Tavčar1,Boštjan Kaluža1 Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenija1 e-mail: rutger.kerkhoff@gmail.com, {ales.tavcar, bostjan.kaluza}@ijs.si ABSTRACT As a proof of concept this article will focus on simulating a single smart house. While this model will be Creating simulations for smart cities is a complex and time consuming task. In this paper we show that using significantly less complex, and even allow for exact traditional Bayesian networks and real world data simulation, it is ideal for showing the power of this traces it is possible to infer models that can simulate the approach. In this article we will show that using a manually original domain. The created model can provide great defined Bayesian network already allows for quite accurate insight into the actual subsystems that are considered. predictions. Expanding to an automatically learned network We show that given a set of observed values we can improves these even more, showing great promise for successfully use the created model to simulate data and modelling a complete smart city. show trends present in the original system. 2 RELATED WORK 1 INTRODUCTION Using a Bayesian network to model a system is not a new With modern cities becoming more complex and ever idea, take for example a water supply network [5], where increasing in size it is of vital importance to control and the authors try to predict when a pipe will burst. It has also optimize the different systems present in a city. While the been used to model mobility within a city [3] or to predict different systems already available in a city, e.g. the power when to replace parts of the New York power grid [2]. grid, waste management, bus scheduling etc., can and are While Bayesian networks have not yet been used to model a being optimized, the bigger picture has not been explored complete city, there are approaches which try to tackle large yet. 
Connecting them allows further optimisation and or distributed models. For example a hierarchical object realises emergent behaviour; something that cannot be oriented approach [7] or building local networks and observed when looking at a subsystem alone. identifying which variables likely link the systems [6]. An important aspect of realising such a system is While not needed for the model of the smart house understanding the relations between the subsystems and discussed in this paper these techniques will likely be even within the subsystems themselves. A concrete example necessary when considering a smart city. As an example of of where this knowledge is needed is simulation. It is of what possible data streams and variables can be found in a vital importance to thoroughly test such a city-controlling smart city one can look at sensor data streams in London system before applying it, and for that simulations are [8]. required. A simulation will also allow faster development of While other papers focus mainly on classification or new systems and applications within the smart city. decision support systems [4], we focus on recreating the However, simulating a smart city is an immensely complex original data and trends for a smart environment. We are not task, and in all but the simplest cases it is infeasible to aware of any complete simulation and creation of a trace specify all the variables and relations concerned in such a using a Bayesian network. simulation. We therefore propose to learn a model, representing all the variables and relations in the smart city, 3 DOMAIN from real-world data traces. The domain explored in this paper is that of a smart house, To allow for a better understanding of the system and which is less complex than a city, but it is well suited to the ability to simulate new traces we use a graphical model, show the ideas and explore the methodology. The traces a Bayesian belief network [1] to be exact. In this article we used to model, learn and simulate the smart house are will focus on different ways of creating the network and obtained from the EnergyPlus simulator [14]. The simulator inferring the probabilities. The ability to graphically was developed within the OpUS system [13] and it is based represent the network makes it an ideal tool for on a real world building. While simulating a simulator understanding the modelled system. Moreover, we can input seems pointless, it allows for testing of the model in varying a set of observed variables and update, according to those situations without having to gather real world data for an values, the probabilities of the unobserved which makes it a extended amount of time. For simulating the city this will be suitable tool for creating simulations. necessary. 50 The simulation has 6 input variables, all concerning the Here x are the different possible outcomes for that node. environment. One should think of values for outside Simply counting the occurrences in our training data gives temperature, wind etc. The simulator also supports setting if us a probability table for every possible outcome of X the occupant is present, the house was set to be empty from depending on every possible outcome of Xi, ...Xj . Because 7:00 to 17:00 every day. Redundant output variables were not every possible combination is observed we use a prior α filtered out and we are left with 9 variables, ranging from = 0.5 as an initial count on each variable. 
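The counting-based estimation with the prior count of 0.5 described just above can be written down in a few lines. This sketch is illustrative only and not the authors' WEKA-based code; the data layout (a list of dicts of already-discretized values) is an assumption.

# Illustrative sketch: conditional probability table of node X given its parents,
# with a prior count alpha = 0.5 added to every cell so that unobserved
# combinations keep non-zero probability.
from collections import Counter, defaultdict

def estimate_cpt(data, node, parents, node_values, alpha=0.5):
    """Returns P(node = x | parent assignment) for every parent assignment seen in data."""
    counts = defaultdict(Counter)
    for row in data:
        parent_key = tuple(row[p] for p in parents)
        counts[parent_key][row[node]] += 1
    cpt = {}
    for parent_key, counter in counts.items():
        total = sum(counter.values()) + alpha * len(node_values)
        cpt[parent_key] = {x: (counter[x] + alpha) / total for x in node_values}
    return cpt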
inside temperature, heating coil power consumption to As the observant reader might have noticed the equation electricity produced by a solar panel. All variables are 2 requires discrete variables, and thus we discretized all our recorded in an interval of 15 minutes and are continuous. variables into 10 equal sized buckets. The partition intervals The subsystems that we want to simulate in the city can be where chosen based on a minimum description length seen as the different devices in the house, e.g. the solar principle. For more information and a formal definition we power generator reacting to an increase in solar radiation. refer the reader to Fayyad et al. [12]. We input the values of our observed testing variables 4 METHOD and use this knowledge to calculate the remaining variables. This is done by the junction tree algorithm [10], it To learn a model of the house we use Bayesian networks. calculates the marginal distribution for each unobserved We decided on using Bayesian networks as they do not only node based on the values of the observed nodes. We take perform well when predicting, they also have an the most likely value for each node, set it as evidence and understandable structure that can give insight to the update the margins, till all the variables are set. situation being modelled. We assume the reader to be familiar with the subject as described by Heckerman et al. 5 RESULTS [1]. In this section we will explain the specific parameters and choices made for our implementation. In this section we present the experimental results for the Creating a Bayesian network can be seen as a two stage models presented in the previous section. In section process. First, we define a network structure, nodes and how 5.1 we will look at the structure of the networks and in variables are related as a directed acyclic graph. Second, we section 5.2 we will analyse the performance of the complete must set the probabilities distributions of all the different models. nodes depending on their ancestors. We define our network The data used was obtained from running the as G, our nodes as V and the arcs between the nodes as A. EnergyPlus simulator. We used two days of simulated data for training the model and two days for testing. The G = ( V, A) (1) variables were all recorded at 15 minute intervals. The During the first stage the structure of the network can be concrete implementation was done in Java using the WEKA defined either manually or automatic. First, we define it library [11]. manually, leveraging our knowledge of the system to define 5.1 Network Structure which nodes should be related. An advantage of this is that we will not get an over-fitted network based on the We first created a manual structure of the network. Since coincidences on our training data. A drawback is that we the selected domain is relatively simple with only 17 might miss relations we did not know beforehand, and that variables it could be largely comprehended by the analysis for larger networks defining a structure quickly becomes a of the data. We looked at correlations between the complex task. variables, used our knowledge of the physical processes and Because the drawbacks will become more significant in experimented with a few possible networks to get a network the smart city we also implement an automated approach to that covered all the variables. Because of computational learn the structure from the data. 
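One reasonable reading of the simulation loop described in this section, set the observed inputs as evidence, then repeatedly fix the most confidently predicted unobserved node at its most likely value and update the marginals, is sketched below. It is not the authors' code: infer_marginals is an assumed helper (for example backed by a junction-tree implementation) that returns, for every still-unobserved node, a dict of value -> posterior probability given the current evidence.

# Illustrative sketch: greedy evidence propagation until all variables are set.
def simulate_trace(observed, all_nodes, infer_marginals):
    """observed: dict node -> value for the input (environment) variables."""
    evidence = dict(observed)
    while len(evidence) < len(all_nodes):
        marginals = infer_marginals(evidence)          # posteriors given current evidence
        # pick the unobserved node whose most likely value is most certain
        node = max(marginals, key=lambda n: max(marginals[n].values()))
        best_value = max(marginals[node], key=marginals[node].get)
        evidence[node] = best_value                    # fix it and update the margins
    return evidence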
We use a local search challenges some relations were simplified to restrict the based metric, the “Look Ahead in Good Directions, LAGD, number of parents of a node. Precautions were taken to hill-climbing algorithm [9]. It looks at a sequence of best make sure no relations were lost, for example, the solar future moves, instead of a single move as is usually done for radiation influencing the solar panel which in turn hill climbing. The algorithm to calculate a sequence has influences the battery. The arc from solar panel to battery exponential time complexity, and therefore it first computes was removed and instead it was linked directly to the the best moves and then decides on a sequence. It scoring radiation. This goes at the cost of a little accuracy but does function considers conditional mutual information between not lose the relation. In Figure 1, a network created using the nodes. For a formal definition please see Abramovici et the LAGD hill climber algorithm with at maximum 1 parent al. [9]. is shown. The figure does not show all the variables, as the The second problem, learning the probability nodes that do not have any arcs are left out for the sake of distributions, is solved by calculating direct estimates of the readability. The reasons why certain variables do not have conditional probabilities using a set of training data. For a arcs will be discussed in the next subsection. It is single node X ∈ V we define the probability that X = x as: interesting that a lot of arcs seem to be reversed. Note; however, that setting evidence for any variable will update P (x) = P (X = x|Xi, ..., Xj ) (2) 51 Figure 1: Network generated by LAGD Hill climber margins throughout the network, and thus even a reverse learned network. For the heating coil and the washing relation is still captured. Automatic network generation can machine even the baseline has an exceptionally good model inconsistent relations, for example, Net Purchased prediction power. This is because they are off in all but a few Power related to Wind speed seems to be over fitted on a cases, showing the importance of complete training data. The coincidence in the data. The other curious relations, like manual network performed better on a subset of the Solar Produced and Inside temperature can be explained by variables, for example power used by lighting. Lighting can the fact that all the input variables are closely related, on a be clearly linked to time, this was not found by the LAGD warm day there will be more solar radiation etc. algorithm as it does not occur often in the training data. However, in most cases automatic structure generation did 5.2 Performance perform better. Some of the interesting variables are the To test the performance of the networks we computed the predicted values of inside temperature (Fig. 2b), solar power percentage of correctly predicted values per variable and produced (Fig. 2c) and battery charged state (Fig. 2a). Note the root mean-square-error (RMSE). However, even more that the predicted data was discretized into buckets and for important than predicting the right values is to predict the each datapoint the average of the bucket was used for the trends in the data. A few erroneous spikes are of lesser calculation of the RMSE and as data points in the graphs. importance then missing a trend. Therefore we also plotted The predicted inside temperature over time shown in the data and did a visual analysis of the results. 
Figure 2b follows the general trends for both models, though the manual network is over-fitted on outside temperature and Baseline Manual Automatic clearly performs worse. Another problem is that because of Variable % E % E % E the discretisation some small changes are amplified. In Figure 2c the prediction of power produced by the solar Battery charge [J] 8 85e5 82 21e4 74 47e4 panel is shown. It is closely correlated to one of the input Heating coil 99 1.15 99 1.15 99 1.15 variables, solar radiation, and is therefore quite accurately Washing machine 98 4.27 98 4.27 98 4.27 predicted. The second production increase is not seen in the Total demand 51 87.54 51 87.54 60 15.17 graph for the manual network as the values are still in the Total produced 67 84.32 67 84.32 76 28.52 range of the first bucket. Cooling coil 69 46.49 35 49.45 77 22.25 Predicted battery charge state over time can be seen in Net purchased 65 35.61 65 35.61 65 35.61 figure 2a. This is difficult to predict as it depends on many variables. As can be seen the manual network is over-fitted Solar produced 60 88.31 81 34.58 82 35.37 on a wrong variable; time. The first charge peak happens to Lights 72 35.53 88 12.46 72 35.53 coincide in the training and testing data and therefore the Inside temp [C] 34 2.62 30 0.6 72 0.24 percentage of correct prediction is still high. The automatic Overall 62 70 77 network is not conclusive in predicting a trend. For variables like a battery, which cannot drop quickly and get back up Table 1: All variables are power in watt [W] if not stated again, it will be a valuable extension to consider the previous otherwise. The columns labelled % depicts the percentage state as well. correctly classified. E is the RMSE, lower is better. In general, the automatically constructed network model In Table 1 the percentage of correctly predicted values performed better than the manually constructed one and and the RMSE for each variable for three separate possibly correct but yet unknown relations were found. The classifications is shown. First a baseline where the most variables could be predicted relatively accurately and most probable bucket was chosen based on just priors. Second the trends present in the original data could also be found in the manually constructed network and last the automatically generated data. 52 Figure 2: Predictions of system parameters 4 CONCLUSION [4] Lanini S., Water management impact assessment using a Bayesian network model, Proceedings of the 7th We have shown that it is possible to build complex models Conference on Hydroinformatics, 2006. from real world traces that model relations between [5] Babovic V., Drcourt J., Keijzer M., and Hansen P. F., A subsystems. These models can then be used to simulate the data mining approach to modelling of water supply system and generate more data based on a set of input assets Urban Water, vol. 4, no. 4, 2002. variables. We have seen that trends in the data can be [6] Rong C., Sivakumar K. and Kargupta H., Collective modelled and even single predictions can be used as an mining of Bayesian networks from distributed indication of expected data. heterogeneous data, Knowledge and Information While automatic generation of a Bayesian network for a Systems, vol 6, no. 2, 2004. domain is possible, some expert knowledge will still be [7] Molina, J. L., John Bromley, J. L. Garca-Arstegui, C. required to reduce over-fitting due to coincidences in the Sullivan, and J. Benavente, Integrated water resources data, and to improve the network. 
Due to the nature of the management of overexploited hydrogeological systems Bayesian networks the cooperation with a domain expert using Object-Oriented Bayesian Networks, can easily be established. Environmental Modelling & Software, vol. 25, no. 4, The biggest drawback of this method is that it largely 2010. depends on the available data. A lack of data will lead to [8] Boyle D., Yates D. and Yeatman E, Urban Sensor Data incomplete or incorrect models. Streams: London 2013, IEEE Internet Computing, vol. The most important direction for future work will focus 17, no. 6, 2013. on taking the temporal nature of the network into account. [9] Abramovici M., Neubach M., Fathi M. and Holland A., Expanding to dynamic Bayesian networks or Hidden Competing fusion for bayesian applications, Markov Models will allow for an even more accurate Proceedings of International Conference on Information prediction of trends. The challenge will be to cover the Processing, vol. 8, p. 379, 2008. unknown parameter space that is not directly present in the [10] Lauritzen S. L., and Spiegelhalter D. J., Local training data. Introducing Gaussian probabilities to closer model the values produced by different sensors is another computations with probabilities on graphical structures possible direction for future work. A third extension will and their application to expert systems, Journal of the handle tackling computational challenges that arise when Royal Statistical Society. Series B, 1988. the network sizes increases, those may be solved by creating [11] Hall M., Frank E., Holmes G., Pfahringer B., a more hierarchical network structure. Reutemann P., Witte I. H., The WEKA Data Mining Software: An Update, SIGKDD Explorations, vol. 11, References no 1, 2009. [1] Heckerman D., Bayesian networks for data mining, Data [12] Fayyad U. and Irani K., Multi-interval discretization of mining and knowledge discovery, vol. 1, no. 1, 1997. continuous-valued attributes for classification learning, [2] Rudin C., Waltz D., Anderson N. R., Boulanger A., Chambery, France, 1993. Chow M., Dutta H. Machine learning for the New York [13] Tavčar A., Piltaver R., Zupančič D., Šef T. and Gams City power grid, IEEE Transactions Pattern Analysis M., Modeliranje Navad Uporabnikov Pri Vodenju and Machine Intelligence, vol. 34, no. 2, 2012. Pametnih Hiš, Proceedings of Information Society 2013, [3] Fusco G., Looking for Sustainable Urban Mobility p 114-117, 2013. through Bayesian Networks, Cybergeo: European [14] US Department of Energy, EnergyPlus Energy Simu- Journal of Geography, 2003. lation Software, eere.energy.gov/buildings/energyplus/, accessed 08-2014. 53 INCLUSION OF VISUALLY IMPAIRED IN GRAPHICAL USER INTERFACE DESIGN Mario Konecki Faculty of Organization and Informatics University of Zagreb Pavlinska 2, 42000 Varaždin, Croatia Tel: +385 42 390834 e-mail: mario.konecki@foi.hr ABSTRACT readable in this kind of approach but even moderately large tables are almost impossible to present. Visually impaired programmers have been included into  programming industry since its very beginning and they Graphical charts interpretation – Graphical charts are were able to perform their jobs without difficulties. usually made of several different sub-elements and Graphical user interfaces and point and click method of shapes which cannot be adequately interpreted.  instructing computers have created many difficulties for Robustness issue – Inability to cope with new visually impaired programming professionals. 
Visually technology and constant software development. impaired have interest in programming just as everyone The same problems have emerged in programming domain else and the means of their inclusion in overall software where various graphical environments have appeared as development process are important issues that need to well as the need to create graphical user interfaces by point be resolved. One of disadvantages for visually impaired and click method since textual description of graphical is the lack of assistive technology that would enable elements for visually impaired was too complicated and them to design and create graphical user interfaces. In virtually impossible in practice [7]. And although this this paper the GUIDL (Graphical User Interface problem was not so prominent in the area of web Description Language) system that is aimed to resolve programming since its textual coding nature, it was very real the mentioned issues is presented and discussed. in the domain of classic desktop programming [6]. Inclusion of visually impaired as equals into all aspects of 1 INTRODUCTION social and business life remains important issue and enabling visually impaired to design graphical user Inclusion of visually impaired in the world of computers interfaces is one of its aspects. The interest of visually and programming has been present from the beginning of impaired for programming today is present and actual [1, computer mass usage [5]. Visually impaired have been able 12]. There are over 130 blind programmers registered at the to use computers and perform various programming tasks American Foundation of the blind programmers and by using assistive technology in the form of various text-to- programming is stated as potentially promising carrier speech synthesizers of which the most well-known are opportunity for visually impaired in Europe [2, 11]. JAWS, HAL Screen Reader, COBRA, Window Eyes and Inclusion of visually impaired into overall software Easy Web Browsing [7]. However, the graphical revolution development process which includes the design of graphical in the world of computers has made using computers for user interfaces is an actual and important issue. Its solution visually impaired much more difficult since text-to-speech in a form of GUIDL (Graphical User Interface Description synthesizers were not able to represent the context and Language) system is proposed and described in the rest of organization of graphical screens. Some problems that this paper. existing screen reading technologies came across are [7]:  Interpretation of images – Screen readers are not able 2 POSSIBLE APPROACHES TOWARDS SOLUTION to adequately interpret images. Only properties and descriptions of images can be presented. In order to solve the problem of inclusion of visually  Graphical layout and context – Screen readers read impaired into the process of overall software development information in linear way that is not sufficient to and to enable them to design and create graphical user interpret complex graphical user interfaces and screen interfaces several possible approaches can be taken [7]: organization.  Interpreters of specific graphical elements and  Reading of data tables – Because of linear way of attributes of every development environment could be reading information small tables are suitable and created 54  Audio support for creation of graphical elements could programming language. 
Conceptual model of GUIDL be incorporated into programming environments system is shown in Figure 1 [8].  A specific scripting language for every programming technology and environment could be developed All mentioned approaches are time consuming and specific to particular programming language and environment. In order to provide a more universal solution several Wrapper/ Final requirements must be satisfied [8]: mediator 1 UI  Easy usage: system has to be simple and easy to use so code 1 it can be used by programmers but also by designers and other interested computer users. GUIDL  Intuitive, simple and understandable syntax: system’s language language that will be used to describe the graphical elements has to be intuitive, simple to use and easy to understand. 2 TEXT NORMALI Wr ZAT appe I r O / N AND GRAPH Fina E l ME-TO-  Technology independence: system and its language for PHONEME CONVERS mediatIO or N 2 UI description of graphical user interfaces has to be code 2 applicable to various programming languages and development environments, not to be technology specific.  Extensibility: system and its language have to be to be Figure 1: Conceptual model of GUIDL system extensible so they can include support for new programming languages and corresponding In GUIDL system visually impaired programmers start the development environments as well as new graphical development of programming language specific graphical element and attributes. user interface by defining the interface layout in GUIDL Mentioned requirements have been evaluated through using all assistive built-in concepts. After the GUIDL code conducted research that confirmed stated requirements and has been written, the GUIDL lexer performs lexical added some new requirements with the requirement for analysis. Lexical analysis uses scanner that reads the code proper documentation being the most frequently mentioned. character by character and transfers it to lexer which checks whether received characters form an array that can be 3 GUIDL SYSTEM identified as an acceptable string or token. The set of In order to provide a more general solution a GUIDL acceptable strings or tokens are defined in GUIDL context- (Graphical User Interface Description Language) system free grammar [9] G = (V, Σ, R, S) where V is the finite set of nonterminal symbols, Σ is the finite set of terminal has been developed with GUIDL language as its core part. GUIDL language enables visually impaired to define all symbols, R is the set of substitutions or production rules of from A → α where A is graphical elements in one place using language that is some nonterminal symbol and α is a simple and has several assistive concepts such as: string over V ∪ Σ.  Predefined gradual sizes of forms S represents the start symbol which is nonterminal. Every  Predefined gradual sizes of graphical elements string x  (V ∪ Σ)* which has the form of yAz can be  turned into yαz by using production rule A → α that Predefined width/height attribute values  substitutes A with α. Set of terminals defines the language's Division of forms into quadrants  alphabet and Σ  V. 
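For reference, the grammar components used by the GUIDL lexer and parser can be written compactly in standard context-free-grammar notation; this restates the general definitions only, not the concrete GUIDL grammar:

\[ G = (V, \Sigma, R, S), \qquad V \cap \Sigma = \emptyset, \qquad R \subseteq \{\, A \rightarrow \alpha \;:\; A \in V,\ \alpha \in (V \cup \Sigma)^{*} \,\} \]

\[ yAz \Rightarrow_{G} y\alpha z \quad \text{for every production } A \rightarrow \alpha \in R \text{ and any } y, z \in (V \cup \Sigma)^{*} \]

\[ L(G) = \{\, q \in \Sigma^{*} \;:\; S \Rightarrow_{G}^{*} q \,\} \]

In other words, L(G) is exactly the set of token strings that the GUIDL parser accepts as syntactically valid code.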
The set of possible tokens or the Possibility to position graphical elements into one of language itself is defined as L(G) = {q  * : S  form’s quadrants G* q} or  all strings of finite length that are composed of zero or more Possibility to define the position offset of forms  symbols from  in a way that particular string can be Possibility to define the position offset of graphical generated from start symbol by using zero or more steps elements  which are defined by production rules. Detection of problems with position of graphical The set of tokens produced by lexer is processed by GUIDL elements (graphical element out of form boundaries)  parser that compares the order of the tokens against defined Automatic correction of problems with form’s grammar rules and in this way conducts the syntax dimension and position (form out of screen boundaries) validation of written code. After the syntax is validated the In this way visually impaired are able to define entire user parser creates corresponding syntax tree from which the interface in just one place and one technology and GUIDL GUIDL generator generates the final graphical user system then enables them to translate that interface into interface code for specific programming language and desired programming technology format which can then be environment. Described process is shown in Figure 2 [8]. included into native programming environment of chosen 55 GUIDL code x = a + b + c Lexical analysis GUIDL lexer Tokens id equals id plus id plus id Syntactic analysis GUIDL parser Syntactic tree GUI c ode generator Specific GUI code Figure 2: Steps in using GUIDL system 3.1 GUIDL syntax 4 GUIDL AS ASSISTIVE TECHNOLOGY GUIDL syntax has been designed to be simple and easy to GUIDL system and its language are designed to be simple use in order to enable visually impaired to perform their and understandable. Its main purpose is to provide visually design tasks in quick and efficient manner. Its form is impaired an assistive technology that would enable them inspired by the simplicity of BASIC (Beginner's All-purpose easier creation of designed graphical user interfaces. There Symbolic Instruction Code) syntax which has been created are several models that support development of assistive for beginners and for learning purposes. Partial GUIDL technology and one of them is CAT (Comprehensive grammar in EBNF [10] form is given below. Assistive Technology) [4] that was used in development of GUIDL system. CAT model is hierarchically structured project = projectcode, controlname, form; with four aspects as is shown in Figure 3 [4]. 
projectcode = 'Project ' | 'project '; form = formcode, controlname, formattributes, [controldeclarations], formend; Context formcode = 'Frm ' | 'frm '; formend = ('End' | 'end'), [eol]; controlname = qoute, word, qoute, eol; Comprehensive Person formattributes = frmcommonattributes, Assistive windowstateattribute, {colorattribute}; frmcommonattributes = textattribute, Technology Activities frmrestcommonattributes; model frmrestcommonattributes = frmsizeattribute, locationattribute; Assistive frmsizeattribute = sizecode, (frmsize | frmwidth, ws ,frmheight), eol; technology locationattribute = locationcode, xposition, ws, yposition, eol; Figure 3: CAT model frmsize = 'frmsize1' | 'frmsize2' | 'frmsize3' locationcode = 'Location = ' | 'Location=' | 'location = ' | In the case of GUIDL system development, the context is 'location='; composed of several aspects such as: teams and working xposition = 'left' | 'center' | 'right'; environment in which visually impaired programmers work, yposition = 'top' | 'middle' | 'bottom'; overall labor market, the attitude towards visually impaired 56 as programmers, government measures to aid visually elements. Possible approaches towards solution of impaired in getting jobs, etc. Person that uses GUIDL is mentioned problems have been presented in this paper visually impaired person that wants to be a part of overall along with GUIDL system as assistive technology that is software development process and wants to work as a aimed at including visually impaired in graphical user programmer of equal opportunities. Activities that are interface design as an integral part of overall software supported by GUIDL system are activities of quicker and development process. easier development of graphical user interfaces that will Evaluation of GUIDL system has shown that it is suitable as enable visually impaired to be included as equals in the an assistive technology and that it enables visually impaired work of the development team that they belong to. to perform the actions of graphical interface design in an GUIDL system has a meaning of assistive technology [3] in easier and more suitable manner. Adding new features and a way that it is designed to overcome the obstacle of concepts to GUIDL system will be a part of future work. designing and creating graphical user interfaces in graphical point and click development environments. GUIDL system References isn't build to replace development environments and to [1] Alexander, S. Blind programmers face an uncertain provide a specific isolated technology, it is designed to future, available at include visually impaired in actual programming http://www.cnn.com/TECH/computing/9811/06/blindpr technologies’ parts in which they had the most difficulties. og.idg/index.html, accessed 14th August 2014, 1998. Visually impaired programmers start their development in a [2] Cattani, R. The employment of blind and partially- native programming environment where they set a project. sighted persons in Italy: A challenging issue in a Then they use GUIDL to design graphical user interfaces changing economy and society, available at which are then included in already made project where http://www.euroblind.org/media/employment/employme visually impaired programmers continue with development nt_Italy.doc, accessed: 25th March 2011. and writing of program code which is something that they [3] Francioni, J. M.; Smith, A. C. 
Computer Science were always been able to do well by using another assistive Accessibility for Students with Visual Disabilities. ACM technology in a form of text-to-speech synthesizers. SIGCSE Bulletin, 34(1):91-95, 2002. In this way visually impaired become included in the design [4] Hersh, M. A.; Johnson, M. A. Assistive Technology for process and creation of graphical user interfaces as well as Visually Impaired and Blind People. Springer, 2008. in other segments of overall software development. [5] Hodson, B. Sixties Ushers in Program To Train Blind Programmers. Computer World Canada 11, 2004. 4.1 GUIDL system in practice [6] Konecki, M.; Kudelić, R.; Radošević, D. Challenges of GUIDL system has been tested on 47 participants that were the blind programmers. In Proceedings of the 21st given the GUIDL prototype along with instructions and Central European Conference on Information and examples of its use. All participants were given several Intelligent Systems, pages 473–476, 2010. practical programming tasks through which they had to [7] Konecki, M.; Lovrenčić, A.; Kudelić, R. Making evaluate whether the GUIDL system will provide an Programming Accessible to the Blinds. In Proceedings efficient assistive role in their programs development of the 34th International Convention. Croatian Society activities. The evaluation of GUIDL system has shown that for Information and Communication Technology, it indeed serves well as assistive technology. All performed Electronics and Microelectronics – MIPRO, pp. 820– tasks have been reported as easier and quicker when using 824, 2011. GUIDL system than when using purely native technology. [8] Konecki, M.: A New Approach Towards Visual GUIDL system has also been reported as a suitable mean of Programming for the Blinds. In Proceedings of the 35th inclusion of visually impaired into activities of overall International Convention on Information and software development that includes design of graphical user Communication Technology, Electronics and interfaces. Another important aspect of GUIDL as assistive Microelectronics - MIPRO, pages 935–940, 2012. technology is that it enables visually impaired to work in [9] Lewis, H. R.; Papadimitriou, C. Elements of the Theory actual technologies rather than having isolated and of Computation. Prentice Hall, Inc., 1981. specialized system. [10] Information technology – Syntactic metalanguage – Extended BNF. ISO/IEC 14977. Geneva, Switzerland: 5 CONCLUSION ISO, 1996. Visually impaired programmers have been a part of [11] bfi Steiermark, European Labour Market Report, computer revolution since its very beginning. Graphical available at user interfaces and occurrence of point and click http://eurochance.brailcom.org/download/labour- development environments have left visually impaired in market-report.pdf, accessed: 25th March 2011. difficult position since existing assistive technology in the [12] Blind Programming Project in NetBeans IDE, available form of text-to-speech synthesizers could not cope well at http://netbeans.dzone.com/videos/netbeans-ide-for- enough with rapid development and new graphical the-blind/, 2010, accessed: 20th August 2014. 57 MINING TELEMONITORING DATA FROM CONGESTIVE-HEART-FAILURE PATIENTS Mitja Luštrek1,2, Maja Somrak1,2 1Jožef Stefan Institute, Department of Intelligent Systems 2Jožef Stefan International Postgraduate School e-mail: {mitja.lustrek, maja.somrak}@ijs.si ABSTRACT observational study was carried out in the project with the intention to generate such knowledge. 
This paper presents The Chiron project carried out an observational study an initial analysis of the data gathered in this study. in which congestive-heart-failure patients were telemonitored in two countries. Data from 1,068 2 DATA FROM THE CHIRON STUDY recording days of 25 patients were gathered, consisting of 15 dynamic parameters (measured daily or 2.1 Data gathering and description continuously) and 49 static parameters (measured once The data analyzed in this paper were gathered in the period or a few times during the study). The features derived from May 2013 to May 2014. The whole study included 38 from these parameters were mined for their association CHF patients: 19 from the United Kingdom and 19 from with the feeling of good/bad health. The findings mostly Italy. However, some of the data were incomplete, so only correspond to the current medical knowledge, although the data of 12 patients from the UK and 13 patients from some may represent new insights. Italy were included in the analysis. These 25 patients together provided a total of 1,068 usable recording days. 1 INTRODUCTION The data consists of 64 parameters carefully selected based Telemonitoring of patients with chronic diseases is on their relevance to CHF [7]. becoming technically increasingly feasible, but benefits for The initial measurements of 49 static parameters were the patients are not always apparent, nor is it clear how to taken for each of the patients at the beginning of the study. make the most of the data obtained this way. In the case of This data includes general patient information (age, gender, heart failure, two systematic literature reviews showed BMI, waist-to-hip ratio, smoking, etc.), their current medical lower mortality resulting from telemonitoring [1][2], but in treatments (beta blockers, anti-coagulants, ACE inhibitors, the trials they reviewed, telemonitoring was mostly etc.), related health conditions (arrhythmias, hypertension, compared with conventional care worse than what is offered diabetes, etc.) and the results of a blood analysis today. Conversely, two large recent trials showed no benefit (hemoglobin, lymphocytes, LDL/HDL cholesterol, blood from telemonitoring [3]. However, the telemonitoring in glucose, Na and K levels, etc.). Some of these measurements these two trials was not very advanced – the monitored were repeated periodically every few weeks to provide up- parameters were limited and no intelligent computer to-date information. However, the exact period varied from analysis was involved. We can conclude from this that as the patient to patient and roughly half of the patients only had conventional care improved, so should telemonitoring. One the measurements taken at the beginning of the study. way to do so is by using intelligent computer methods on the During the study, the patients were wearing vital-signs gathered data, both to save the time of the medical personnel monitoring equipment [5] for several hours each day. The who would otherwise have to look at all the data themselves, equipment consisted of an ECG device, two accelerometers and to uncover previously unknown relations in the data. places on the chest and thigh, a body-temperature and a This paper describes the mining of telemonitoring data humidity sensor. The ECG recordings were subsequently from congestive-heart-failure (CHF) patients gathered in the analyzed to extract the physiological parameters related to Chiron project [4]. 
The objective of this project was to the heart rhythm: heart rate, QRS interval, QT interval, PR develop a framework for personalized health management interval, T wave amplitude and R wave amplitude. The with a focus on telemonitoring. The Chiron patients were accelerometers continuously provided the patient’s activity equipped with a wearable ECG, activity, body-temperature, and energy-expenditure estimation. The temperature and sweating and sensors. In addition, their blood pressure, humidity sensors provided the measurements of the skin blood oxygen saturation, weight, and ambient temperature temperature and sweating index in five-minute intervals. and humidity were measured [5]. The data gathered this way The patients were also provided with a mobile application was fed into a decision-support system, whose objective was for generating weekly and daily reports. The patients to estimate the health risk of the patients [6]. However, since reported their overall feeling of health with respect to the there is not enough knowledge on how to associate the previous day on a daily basis (feeling much worse than values of the various measured parameters with the risk, an yesterday, worse, the same, better or much better), and 58 answered 13 questions about their health and well-being on • Much worse vs. much better ( MW-MB) a weekly basis. In addition, they reported measurements of • Much worse or worse three times in a row vs. much systolic and diastolic blood pressure, body mass, blood better or better three times in a row ( MW3-MB3) oxygen saturation, and ambient temperature and humidity. • Much worse or worse vs. much better or better ( MWW- These – together with the continuously monitored MBB) parameters – are labeled dynamic in Section 3. • Much worse vs. everything else ( MW-E) The study also intended to gather data about hospital • Much worse or worse three times in a row vs. admissions and deaths, but no such events occurred during everything else ( MW3-E) the study period. Therefore we decided to use the patients’ • Much worse or worse vs. everything else ( MWW-E) self-reports of health instead. The analysis in this paper is based on the daily questions about the feeling of health. The majority of the data instances have the class ‘feeling the same as yesterday’, while very few instances have ‘feeling 2.2 Data preprocessing much better’ or ‘feeling much worse’. Because of this, the The ECG and accelerometer data recordings required the first three classes result in discarding the majority of the most attention when preprocessing the data prior to the data instances (only 69, 101 or 285 instances remain), while the mining. These two types of recordings also generated the last three use all 1,086 of them. Since classes are vast majority of all the gathered data. imbalanced, particularly in the last three cases, we used The ECG signal was already processed with the Falcon cost-sensitive classification, with the costs of algorithm [5], producing an output where each heart beat is misclassifications compensating for the imbalances. described with an 11-tuple. Because the tuples were not explicitly separated and some of them are incomplete, it was 3 MINING THE DATA important to distinguish between them in order to extract the Since the number of combinations of data-mining specified parameters. We used R-peaks in the ECG signal to algorithms, features and classes is huge, we designed a identify distinct tuples. 
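With the WEKA suite used for the data mining in the next section, a cost-sensitive wrapper of this kind can be set up along the following lines. This is only a sketch under stated assumptions: the file name chf_mw3_mb3.arff, the two-class setup (e.g. MW3 vs. MB3) and the random forest as base learner are hypothetical, and the misclassification costs are simply set inversely proportional to the observed class frequencies, which is one way of compensating for the imbalance.

import java.util.Random;

import weka.classifiers.CostMatrix;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.CostSensitiveClassifier;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CostSensitiveSetup {
    public static void main(String[] args) throws Exception {
        // Hypothetical export of one of the two-class datasets (e.g. MW3-MB3).
        Instances data = new DataSource("chf_mw3_mb3.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // Count the instances of each class to derive imbalance-compensating costs.
        double[] counts = new double[data.numClasses()];
        for (int i = 0; i < data.numInstances(); i++) {
            counts[(int) data.instance(i).classValue()]++;
        }

        // 2x2 cost matrix: the two off-diagonal cells penalise errors on the rarer class more
        // heavily (see WEKA's CostMatrix documentation for the row/column convention).
        CostMatrix costs = new CostMatrix(2);
        costs.setCell(0, 1, counts[1] / counts[0]);
        costs.setCell(1, 0, counts[0] / counts[1]);

        CostSensitiveClassifier classifier = new CostSensitiveClassifier();
        classifier.setClassifier(new RandomForest());
        classifier.setCostMatrix(costs);

        Evaluation evaluation = new Evaluation(data);
        evaluation.crossValidateModel(classifier, data, 10, new Random(1));
        System.out.printf("Accuracy: %.1f %%, weighted F-measure: %.2f%n",
                evaluation.pctCorrect(), evaluation.weightedFMeasure());
    }
}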
Additionally, a lot of the data was three-step data-mining procedure (described in detail in corrupt or missing, so those parts had to be removed. Sections 3.1–3.3): Similar problems occurred when processing the 1. Selection of algorithms that classify the data with a high accelerometer data. It was not possible to extract the accuracy and yield understandable models information about the activity and energy expenditure if a recording of any one of the axes of either of the two sensors 2. Using the selected algorithms, selection of features that was missing. If a patient forgot to wear both sensors, or one classify the data with a high accuracy and are had an empty battery, the data thus had to be discarded. understandable Finally, some data was not uploaded successfully to the 3. Using the selected algorithms and features, selection of servers due do connection problems, and some data are classes that result in accurate models missing as a result of inconsistent patients’ behavior. At the end of these three steps, we ended up with a number All of the parameters that were measured continuously of interesting models, some of which are presented in were further separated by the main activities of the day: Section 3.4. during lying, sitting and moving separately (resulting in features labeled per_act in Section 3) or during all the 3.1 Selection of algorithms activities together ( all_act). The ratios of the durations of In the first step we used MW3-MB3 classes and the avg these three activities were calculated for each day. For every subset of dynamic all_act features. We compared several parameter that was measured continuously or multiple times algorithms from the Weka suite [8] shown in Table 1. We per day, the average value ( avg) and standard deviation ( sd) selected the underlined algorithms for the experiments in were calculated; the calculations were done for separate Sections 3.2 and 3.3 due to their accuracy and in the case of activities and for the whole day. JRip to have another understandable algorithm. The key value whose association with the other Table 1: Comparison of data-mining algorithms monitored parameters we study in this paper – the overall feeling of health – was reported by the patients relatively to Algorithm Accuracy the previous day. Since the value is not absolute (e.g., Random Forest 79.3 % feeling well) but relative (e.g., feeling better or worse than Naive Bayes 77.4 % yesterday), it is associated with the measurements of both J48 76.3 % the current and the previous day. Because of that we SVM, Puk kernel 74.5 % introduced features that represent changes of the parameters’ SVM, linear kernel 74.2 % values with respect to the previous day ( chg). Again, the SGD 73.8 % calculations were done for separate activities and for the Multilayer Perceptron 73.2 % whole day. JRip 71.9 % For the purpose of data mining, classes were assigned to kNN, k = 1 60.9 % the data. If each of the five distinct feelings of health kNN, k = 2 56.2 % corresponds to one class, the differences between them are kNN, k = 3 47.8 % too small. Therefore we decided to have only two classes: SVM, RBF kernel 40.1 % 59 3.2 Selection of features 3.3 Selection of classes We first compared predefined features sets described in We compared the accuracies of different classes on all the Section 2. Since the number of combinations is large, we algorithms selected in Section 3.1 and all the features proceeded in several sub-steps. First, we compared subsets selected in Section 3.2. 
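A comparison like the one in Table 1 can be scripted directly against the WEKA API, for example as below. The ARFF file name is hypothetical, default parameters and a plain 10-fold cross-validation are assumed for every algorithm, and SGD and the multilayer perceptron are left out for brevity.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.SMO;
import weka.classifiers.lazy.IBk;
import weka.classifiers.rules.JRip;
import weka.classifiers.trees.J48;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class AlgorithmComparison {
    public static void main(String[] args) throws Exception {
        // Hypothetical export of the MW3-MB3 classes with the avg subset of all_act features.
        Instances data = new DataSource("chf_mw3_mb3_avg.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        Map<String, Classifier> candidates = new LinkedHashMap<>();
        candidates.put("Random Forest", new RandomForest());
        candidates.put("Naive Bayes", new NaiveBayes());
        candidates.put("J48", new J48());
        candidates.put("SVM (SMO, default kernel)", new SMO());
        candidates.put("JRip", new JRip());
        candidates.put("kNN, k = 1", new IBk(1));

        // 10-fold cross-validation of each candidate, reporting accuracy as in Table 1.
        for (Map.Entry<String, Classifier> candidate : candidates.entrySet()) {
            Evaluation evaluation = new Evaluation(data);
            evaluation.crossValidateModel(candidate.getValue(), data, 10, new Random(1));
            System.out.printf("%-28s %5.1f %%%n", candidate.getKey(), evaluation.pctCorrect());
        }
    }
}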
In Table 4 we report the F-measure of dynamic all_act features, finding that only avg and avg + for the Random Forest algorithm (most accurate overall), chg subsets performed better than the rest. The results are averaged over all the features. The F-measure was chosen shown in the first segment of Table 2 with the highest because of the class imbalance, particularly for the three ‘vs. accuracy for each algorithm in bold. Second, we added everything else’ pairs of classes. One can see that MW3- per_act features to these two subsets of features, finding the MB3 performed best, probably because it strikes the best extended features worse than all_act features alone (second balance between the difference between the two classes in segment of Table 2). And third, we combined these two the pair, and the number of instances in the dataset. MW- subsets of features with static features, finding them best of MB may have too few features, while in the other cases the all (third segment of Table 2). However, given the small difference between the two classes is too small. number of patients, it is likely that the static features Table 4: Comparison of classes identified individual patients instead of taking into account their general characteristics. Because of that we retained all Classes MW- MW3- MWW MW- MW3- MWW- the underlined features for experiments in Section 3.3. MB MB3 -MBB E E E F-measure 0.77 0.79 0.66 0.55 0.56 0.61 Table 2: Comparison of predefined feature sets Instances 69 101 285 1,068 1,068 1,068 Algorithm m 3.4 Interesting models e , o es d iv M ip n Classification models were built with the J48 and JRip a y k a rest V u 8 R 4 a o Features N B S P J J R F algorithms (being the most understandable of the five Dynamic, all_act, avg + chg 75.5 80.0 70.6 76.9 80.3 selected in Section 3.1) on all the features selected in Dynamic, all_act, avg 77.4 74.5 71.9 76.3 79.3 Section 3.2. Two examples are presented in Figure 1 and Dynamic, all_act, avg + sd 75.3 73.1 70.9 73.3 77.7 Figure 2. They show that a high heart rate Dynamic, all_act, avg + chg + sd 74.0 78.7 70.3 75.2 78.3 ( HR_avg_all_activities in the figures) and short QRS Dynamic, all_act, chg + sd 67.1 78.6 64.6 64.9 71.9 interval ( QRS_avg_all_activities, a feature of the ECG Dynamic, all_act, chg 62.1 71.2 55.5 64.8 64.4 signal) are associated with the feeling of good health, which Dynamic, all_act, sd 58.2 65.4 63.0 64.6 66.9 corresponds to the existing medical knowledge. Increased Dynamic, all_act + per_act, avg 77.0 72.5 71.6 75.7 78.4 Dynamic, all_act + per_act, avg + chg 73.4 71.8 71.0 76.7 79.1 weight ( DRWChg) is associated with bad health, which Dynamic + static, all_act, avg 77.5 79.2 75.5 76.4 79.3 makes sense, since it often signifies excess fluid retention, a Dynamic + static, all_act, avg + chg 77.8 80.4 77.0 79.6 80.5 common problem of CHF patients. Low humidity ( HumA) and decrease in humidity ( HumAChg) are associated with We also tested automatic feature selection methods from the good health, which matches the medical opinion that CHF Weka suite. None of the methods performed well on its patients often badly tolerate humid weather, although there own, so we used the features selected by at least two is little hard evidence for this. 
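Automatic selectors of the kind evaluated here (correlation-based subset selection, gain ratio, ReliefF, symmetrical uncertainty) can be run through WEKA's attribute-selection API, and the "selected by at least two methods" rule can be implemented as a simple vote, as in the sketch below. The file name and the top-20 cut-off for the ranking-based evaluators are assumptions, and the wrapper evaluator is omitted for brevity.

import java.util.HashMap;
import java.util.Map;

import weka.attributeSelection.ASEvaluation;
import weka.attributeSelection.ASSearch;
import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.BestFirst;
import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GainRatioAttributeEval;
import weka.attributeSelection.Ranker;
import weka.attributeSelection.ReliefFAttributeEval;
import weka.attributeSelection.SymmetricalUncertAttributeEval;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class FeatureVoting {
    private static Ranker topRanker(int n) {
        Ranker ranker = new Ranker();
        ranker.setNumToSelect(n);   // keep only the n best-ranked attributes (arbitrary cut-off)
        return ranker;
    }

    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("chf_all_features.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // Evaluator/search pairs: a subset evaluator with best-first search, and three
        // single-attribute evaluators combined with a ranker.
        ASEvaluation[] evaluators = { new CfsSubsetEval(), new GainRatioAttributeEval(),
                new ReliefFAttributeEval(), new SymmetricalUncertAttributeEval() };
        ASSearch[] searches = { new BestFirst(), topRanker(20), topRanker(20), topRanker(20) };

        // Count how many selectors choose each attribute and keep those chosen at least twice.
        Map<Integer, Integer> votes = new HashMap<>();
        for (int i = 0; i < evaluators.length; i++) {
            AttributeSelection selection = new AttributeSelection();
            selection.setEvaluator(evaluators[i]);
            selection.setSearch(searches[i]);
            selection.SelectAttributes(data);
            for (int attr : selection.selectedAttributes()) {
                if (attr != data.classIndex()) {
                    votes.merge(attr, 1, Integer::sum);
                }
            }
        }
        votes.forEach((attr, n) -> {
            if (n >= 2) {
                System.out.println(data.attribute(attr).name() + " selected by " + n + " methods");
            }
        });
    }
}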
Oxygen saturation ( DRS) methods out of the following: Correlation-based Feature below 97 % is associated with bad health in the second Subset, Gain Ratio, ReliefF, Symmetrical Uncertainty and model, which is normal, since the saturation in healthy Wrapper (the end result of the Wrapper approach was the individuals is 96 % – 100 %. Finally, the first model union of features selected when each of the five algorithms associates high systolic blood pressure ( SBP) and the second selected in Section 3.1 were used). As the starting point, we low diastolic blood pressure ( DBP) with good health. This is used all features, all dynamic features, and avg + chg subset expected in CHF patients, since their hearts have problems of all_act dynamic features. The results in Table 3 show that pumping out enough blood (low systolic blood pressure) as the first and third of these starting points resulted in the best well accepting enough blood (high diastolic blood pressure). models obtained so far, although we retained all the underlined features for the experiments in Section 3.3. Table 3: Comparison of automatic feature selection Algorithm m e , o es d iv M ip n a y k a rest V u 8 R 4 a o Features N B S P J J R F All features, FS 75.5 80.0 70.6 76.9 80.3 Dynamic, all_act, avg + chg, FS 77.4 74.5 71.9 76.3 79.3 Dynamic + static, all_act, avg + chg 75.3 73.1 70.9 73.3 77.7 Dynamic, all_act, avg + chg 74.0 78.7 70.3 75.2 78.3 Dynamic, all_act, avg 67.1 78.6 64.6 64.9 71.9 Figure 1: J48 classification tree on the avg subset of all_act Dynamic, FS 62.1 71.2 55.5 64.8 64.4 dynamic features 60 heart failure patients. Journal of the American College of Cardiology 54, 2009, pp. 1683–1694. [2] S. C. Inglis, R. A. Clark, F. A. McAlister, S. Stewart, J. G. Cleland. Which components of heart failure programmes are effective? A systematic review and meta-analysis of the outcomes of structured telephone support or telemonitoring as the primary component of chronic heart failure management in 8323 patients: abridged Cochrane Review. European Journal of Heart Failure 13, 2011, pp. 1028–1040. [3] C. Sousa, S. Leite, R. Lagido, L. Ferreira, J. Silva- Figure 2: J48 classification tree on the avg + chg subset of Cardoso, M. J. Maciel. Telemonitoring in heart failure: all_act dynamic features A state-of-the-art review. Revista Portuguesa de Cardiologia 33 (4), pp. 229–239. 4 CONCLUSION [4] Chiron project. http://www.chiron-project.eu/. [5] E. Mazomenos, J. M. Rodríguez, C. Cavero, G. Telemonitoring can provide huge quantities of medically Tartarisco, G. Pioggia, B. Cvetković, S. Kozina, H relevant data, which has the potential to revolutionize the Gjoreski, M. Luštrek, H. Solar, D. Marinčič, J. Lampe, care of patients with chronic diseases. However, before this S. Bonfiglio, K. Maharatna. Case Studies. In System can happen, the data must be properly interpreted, for which Design for Remote Healthcare, 2014, pp. 277–332. the current knowledge is not yet entirely adequate. This [6] M. Luštrek, B. Cvetković, M. Bordone, E. Soudah, C. paper presents the data gathered by telemonitoring of CHF Cavero, J. M. Rodríguez, A. Moreno, A. Brasaola, P. E. patients, and the first attempt to uncover interesting relations Puddu. Supporting clinical professionals in decision- in the data by data mining. A systematic procedure for the making for patients with chronic diseases. Proc. IS selection of appropriate data-mining algorithms, features 2013, pp. 126–129. and classes was designed, whose output were a number of [7] P. E. Puddu, J. M. 
Morgan, C. Torromeo, N. Curzen, models associating telemonitored parameters with the M. Schiariti, S. Bonfiglio. A clinical observational feeling of good or bad health. The models correspond quite study in the Chiron project: Rationale and expected well to the current medical knowledge, which demonstrates results. In Impact Analysis of Solutions for Chronic the validity of our approach. Disease Prevention and Management, 2012, pp. 74–82. In the future, we need to solve the technical difficulties [8] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. with extracting the ECG parameters and compute some new Reutemann, I. H. Witten. The WEKA data mining features that may be relevant (e.g., QT interval prolongation, software: An update. SIGKDD Explorations 11 (1), a feature of the ECG signal that is known to be associated 2009, pp. 10–18. with cardiovascular problems). Furthermore, the models resulting from data mining must be carefully examined by cardiologists, both the models presented in the paper and others. Those that contain hitherto unknown relations may be even more important than those that correspond to the current medical knowledge, since the relations in them may yield new and important insights. Finally, it would be desirable to study data that contain events such as hospital admissions or even deaths, since the findings on such data would be more reliable than on data that only contains self- reported feeling of health. However, another observational study would be needed for that, which is a difficult proposition that would require substantial funding. Acknowledgement This research described in this paper was carried out in the Chiron project, which was co-funded by the ARTEMIS Joint Undertaking (grant agreement # 2009-1-100228) and by national authorities. References [1] C. Klersy, A. De Silvestri, G. Gabutti, F. Regoli, A. Auricchio. A meta-analysis of remote monitoring of 61 APPROXIMATING DEX UTILITY FUNCTIONS WITH METHODS UTA AND ACUTA Matej Mihelč ić 1,3, Marko Bohanec2 1 Ruđer Bošković Institute, Division of Electronics, Laboratory for Information Systems, Croatia 2 Jožef Stefan Institute, Department of Knowledge Technologies, Jamova 39, Ljubljana, Slovenia 3 Jožef Stefan International Postgraduate School, Jamova 39, Ljubljana, Slovenia e-mail matej.mihelcic@irb.hr, marko.bohanec@ijs.si ABSTRACT numerical ones in some suitable way: first, the newly obtained numerical evaluations would facilitate an easy DEX is a qualitative multi-criteria decision analysis ranking and comparison of alternatives, especially those that (MCDA) method, aimed at supporting decision makers are assigned the same class by DEX; second, the sheer form in evaluating and choosing decision alternatives. We of numerical functions may tell us more about the properties present results of a preliminary study in which we of underlying DEX functions, which make them useful for experimentally assessed the performance of two well- verification, representation and justification of DEX models. known MCDA methods UTA and ACUTA to There have already been several attempts to approximate approximate qualitative DEX utility functions with DEX utility functions with numeric ones for various piecewise-linear marginal utility functions. This is seen purposes. A linear approximation method is commonly used as a way to improve the sensitivity of qualitative models in DEX to assess the importance (weights) of criteria [4, 5]. and provide a better insight in DEX utility functions. 
An early method for ranking of alternatives and improving The results indicate that the approach is in principle the sensitivity of evaluation has been proposed in [6] and is feasible, but at this stage suffers from problems of now referred to as QQ [7]. Recently, extensive research has convergence, insufficient sensitivity and inappropriate been carried out to approximate DEX functions with copulas handling of symmetric functions. [7, 8]. However, no known attempts have been made so far to approximate DEX functions with piecewise-linear 1 INTRODUCTION marginal utility functions, as provided by UTA. Multi criteria decision analysis (MCDA) [1] is an approach The aim of this study was to experimentally assess the concerned with structuring and solving decision problems performance of UTA and its variant, ACUTA [11], on a involving multiple criteria. MCDA provides a number of collection of typical DEX functions. The experiments were methods [2] to create a decision model from information carried out using two software tools: DEXi [4] to develop provided by the decision maker. This information can be DEX functions and Decision Deck [12] to run (AC)UTA. given in many ways, for instance by constructing evaluation functions directly, by providing parameters (such as criteria 2 METHODS AND TOOLS weights) to some predefined functions, by giving examples 2.1 DEX and DEXi of decisions, or by pairwise comparison of a subset of decision alternatives. Methods also differ in the DEX [3] is a qualitative MCDA method for the evaluation representation of this information (e.g., quantitative or and analysis of decision alternatives, and is implemented in qualitative) and their primary aim (choosing the best the software DEXi (http://kt.ijs.si/MarkoBohanec/dexi.html) alternative, ranking several alternatives, classifying [4]. In DEX, a decision model consists of hierarchically alternatives into predefined discrete classes, etc.). structured attributes: the hierarchy represents the Bridging the gap between different MCDA methods is decomposition of the decision problem into smaller sub sometimes highly desirable and may have a great practical problems, and attributes at higher levels of the hierarchy value. In this work, we try to combine two MCDA methods: depend on those on lower levels. Figure 1 (left) shows an DEX and UTA. DEX [3] is a qualitative method; it employs example of a tree of attributes for evaluating cars [4]. discrete attributes and discrete utility functions defined in a In the context of this paper, it is important to understand point-by-point way (see section 2.1). This makes DEX that all attributes in DEX models are qualitative and can suitable for classifying decision alternatives into discrete take values represented by words; for instance, the attribute classes. On the other hand, UTA [9, 10] is a quantitative PRICE in Figure 1 can take the values high, medium and method that constructs numerical additive utility functions low. Furthermore, the aggregation of attributes at some level from a provided subset of alternatives (see section 2.2). in the tree is defined by decision tables that consist of This work is motivated by the expectation that DEX’s elementary decision rules. 
For example, the table in Figure 1 functionality would have been substantially enhanced if we (right) defines the aggregation of two lower-level attributes were able to convert its discrete utility functions to PRICE and TECH.CHAR into the higher-level attribute 62 CAR: the values of CAR are specified for all combinations 3. YW: defined on the same space as YM, it represents an of values of PRICE and TECH.CHAR. Essentially, this asymmetric DEX function defined with weights [4]; the means that utility functions in DEX are discrete and defined weights assigned to the three arguments are 60%, 30% in a point-by-point way. This is illustrated in Figure 2, and 10%, respectively. which graphically represents the same function as in Figure All these functions are defined completely for all 1, so that each row of Figure 1 is represented by a dot in combinations of values of their arguments. Figure 2. The connecting lines are used only for 2.2 UTA and ACUTA visualization and are not part of function definition. The UTA method (UTilité Additive) [9,10] is used to assess utility functions which aggregate multiple criteria in a composite criterion used to rank the alternatives. Similarly as DEX, it uses a subjective ranking on a subset of the alternatives. On this basis, it creates piecewise-linear marginal utility functions. For a set of alternatives ܣ , ܽ ∈ ܣ , numerical criteria ݃ ൌ ሺ݃ଵ, ݃ଶ, … , ݃௡ሻ, and the utility function ܷሺ݃ሻ ൌ ܷሺ݃ଵ, ݃ଶ, … , ݃௡ሻ, the marginal utility functions ݑ௜ are approximated with: ௃ ௃ ݃௜ሺܽሻ െ ݃௜ ௃ାଵ ௃ ݑ௜ሾ݃௜ሺܽሻሿ ൌ ݑ௜൫݃௜ ൯ ൅ ሻ െ ݑ ሻሿ ݃௃ାଵ ௃ ሾݑ௜ሺ݃௜ ௜ሺ݃௜ ௜ െ ݃௜ Figure 1: A DEX model and a utility function example [4]. It is assumed that each attribute’s values are divided to CAR α ௃ ௃ାଵ ௜ െ 1 equally-sized intervals ሾ݃௜ , ݃௜ ሿ. The marginal utility functions ݑ௜ are constructed by solving the linear programming problem min F ൌ ∑ୟ∈୅ σሺaሻ exc under the constraints: ௡ ෍ ݑ good ௜ሾ݃௜ሺܽሻሿ െ ݑ௜ሾ݃௜ሺܾሻሿ ൅ ߪሺܽሻ െ ߪሺܾሻ ൒ ߜ, ܾܽܲ ௜ୀଵ ௡ acc ෍ ݑ௜ሾ݃௜ሺܽሻሿ െ ݑ௜ሾ݃௜ሺܾሻሿ ൅ ߪሺܽሻ െ ߪሺܾሻ ൌ 0, ܽܫܾ exc ௜ୀଵ unacc low ݑ ௃ାଵ ௃ ௜൫݃௜ ൯ െ ݑ௜൫݃௜ ൯ ൒ ݏ௜, ∀݅ ∈ ሼ1, … , ݊ሽ, ܬ ∈ ሼ1, … , ߙሽ good ௡ medium acc TECH.CHAR. PRICE ෍ ݑ ∗ ௜ሺ݃௜ ሻ ൌ 1 high b a d ௜ୀଵ ݑ ௃ ௜ሺ݃௜∗ሻ ൌ 0, ݑ௜൫݃௜ ൯ ൒ 0, ߪሺܽሻ ൒ 0, Figure 2: Graphical presentation of the CAR decision table. ∀݅ ∈ ሼ1, … , ݊ሽ, ܬ ∈ ሼ1, … , ߙሽ, ∀ܽ ∈ ܣ Here, ߪሺܽሻ denotes potential error relative to the starting Formally, a DEX utility function is defined over a set of utility ܷሾ݃ሺܽሻሿ. ݃∗ and ݃ criteria ݔ ௜ ௜∗ denote the high and low bounds ଵ, ݔଶ, … , ݔ௡, where all criteria are discrete and can of ݃ take values from the corresponding value scales ܦሺݔ ௜ respectively. ܲ and ܫ respectively denote strict ௜ሻ. A preference and indifference relations. utility function ܷ maps ݔ to the higher-level attribute ݕ: ܷ: ܦሺݔ In some cases there can be many utility functions that can ଵሻ ൈ ܦሺݔଶሻ ൈ ⋯ ൈ ܦሺݔ௡ሻ → ܦሺݕሻ ܷ represent the preferences specified. The utility functions are is represented by a decision table that consists of then assessed by means of post-optimality analysis [9]. elementary decision rules, where each rule defines the value The ACUTA method [11] offers an improvement upon of ܷ for some combination of argument values: 〈ݔ UTA. It proceeds by finding an analytic center of the ଵ, ݔଶ, … , ݔ௡〉 → ݕ additive value functions that are compatible with some user For experiments in this study, we used a number of DEX assessments of preferences. 
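For readability, the piecewise-linear interpolation of the marginal value functions and the linear programme that determines them can be restated in standard UTA notation, following [9]. Here g_i^* and g_{i*} denote the best and worst values of criterion g_i, each criterion scale is split into α_i − 1 equal intervals, δ is a small positive preference threshold, s_i a monotonicity threshold, and the index ranges follow the usual convention:

\[ u_i[g_i(a)] = u_i\!\left(g_i^{J}\right) + \frac{g_i(a) - g_i^{J}}{g_i^{J+1} - g_i^{J}} \left[ u_i\!\left(g_i^{J+1}\right) - u_i\!\left(g_i^{J}\right) \right], \qquad g_i(a) \in \left[ g_i^{J}, g_i^{J+1} \right] \]

\[ \min \; F = \sum_{a \in A} \sigma(a) \quad \text{subject to} \]

\[ \sum_{i=1}^{n} \left( u_i[g_i(a)] - u_i[g_i(b)] \right) + \sigma(a) - \sigma(b) \geq \delta \quad \text{if } a \, P \, b, \]

\[ \sum_{i=1}^{n} \left( u_i[g_i(a)] - u_i[g_i(b)] \right) + \sigma(a) - \sigma(b) = 0 \quad \text{if } a \, I \, b, \]

\[ u_i\!\left(g_i^{J+1}\right) - u_i\!\left(g_i^{J}\right) \geq s_i, \qquad \sum_{i=1}^{n} u_i\!\left(g_i^{*}\right) = 1, \qquad u_i\!\left(g_{i*}\right) = 0, \]

\[ u_i\!\left(g_i^{J}\right) \geq 0, \qquad \sigma(a) \geq 0, \qquad \forall i \in \{1, \dots, n\},\ J \in \{1, \dots, \alpha_i - 1\},\ a \in A. \]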
In this way, ACUTA solves the utility functions, but in this paper we will present only three: model selection problem present in the UTA method when 1. CAR function, as defined in Figures 1 and 2; there are multiple valid solutions. Similarly as UTA, it 2. YM: defined over three attributes ( ݊ ൌ 3ሻ , all the constructs marginal utility functions by solving a attributes have five values. The function is symmetric constrained optimization problem, see [11] for details. and represents a very common DEX function, which In order to approximate DEX utility functions with behaves as min ሺݔଵ, ݔଶ, ݔଷሻ when any of the arguments (AC)UTA, we mapped qualitative DEX attributes ݔ ∈ ܦሺݔሻ takes the lowest possible value, and as a qualitative to equidistant numerical scales ݃ ൌ ሾ1, |ܦሺݔሻ|ሿ. average of ݔଵ, ݔଶ and ݔଷ otherwise. 63 2.3 Decision Deck and Diviz would allow the method to converge. We used the inverse The Decision Deck (http://www.decision-deck.org/project/) DEX attribute label score as a priori rank for the UTA is a project aimed at developing an open-source MCDA method. As a result, all the alternatives with the same DEXi software platform [12]. Diviz is a software component label score were indifferent for UTA. This required us to developed in Decision Deck aimed at designing, executing take only a small, targeted subset of available a priori ranks. and sharing MCDA methods, algorithms and experiments Overall, the results produced by UTA were poor and did [12]. Diviz enables combining programs that implement not accurately approximate input functions. MCDA algorithms in a modular way and connecting them in The ACUTA method performed much better and the terms of workflows. models were built on the whole domain of the DEX utility functions. However, we did experience convergence issues when using inverse DEX label attribute score as a priori rank for all the alternatives, so we had to take a subset that allowed the method to converge. The convergence error message reported by ACUTA was as follows: Error - failed to converge, due to bad information. Please check your data, rescale the problem, or try with less constraints. Figure 4: ACUTA results for DEX function Car. Figure 3: The ACUTA decision support workflow. Figure 3 shows the workflow used in this study to run ACUTA. The input consists of six datasets. The Criteria file contains names and ID's of the decision criteria, the Alternatives file contains names and ID's of the alternatives, the PerformanceTable file contains the attribute values for each alternative, the AlternativeValues file contains a ranking of a small sample of the alternatives determined by the decision maker (usually called a priori ranking), the PreferenceDirection file indicates preferred optimization direction, and the NumberofSegments file defines the number of segments to which the attribute values are split. The output of the workflow is a rank of alternatives given their attribute values and the a priori ranking. Figure 5: ACUTA results for DEX function YM. 3 RESULTS The results for the Car utility function are shown in Figure 4, where g1 and g2 indicate DEX attributes PRICE Several problems were detected when we attempted to and TECH.CHAR. Both marginal utility functions properly approximate DEXi utility functions with (AC)UTA in Diviz. 
increase, and in g1 the relations between utility values in First, the standard UTA method could not handle DEX points 1, and 2 appear right, however utility value in point 3 utility functions and returned an error message: is too high. We noticed similar behavior in function g2. Execution terminated, but no result were produced: you probably hit a bug in the service. […] Figure 5 shows results for the DEX function YM. In our In order to get any results, we had to take only a subset of opinion, marginal utility functions approximate YM quite the rules, that is, remove a subset of entries from the UTA well, however they indicate a common problem encountered performance table. in the experiments: YM is symmetric, therefore ACUTA’s The second problem with UTA was setting the a priori marginal functions should be equal to each other, but they alternative ranking (i.e., the target attribute) in a way that 64 are not. In this way, the resulting representation does not In future work, we wish to theoretically and empirically properly capture the symmetricity of the original function. address these issues and alleviate these problems, either by Marginal utility functions in Figure 6 correctly indicate adopting some other method from the rich set of UTA- that YW is asymmetric and, observing function’s maximum related methods [10], by adapting (AC)UTA to specific values, that the attributes g1, g2, and g3 are less and less properties of DEX functions, or by developing entirely new important. However, some sections of these functions are methods. Eventually, the method should be able to deal with almost constant, which does not hold in the original all type of DEX functions, including large ones, function. incompletely defined ones and those defined with distributions of classes. References [1] Ehrgott M., Figueira J.R., Greco S.: Trends in Multiple Criteria Decision Analysis, International Series in Operations Research & Management Science, Vol. 142, New York: Springer, 2010. [2] Figueira J.R., Greco S., Ehrgott M.: Multiple Criteria Decision Analysis: State of the Art Surveys, Boston: Springer, 2005. [3] Bohanec M., Rajkovič V., Bratko I., Zupan B., Žnidaršič M.: DEX methodology: Three decades of qualitative multi-attribute modelling. Informatica 37, Figure 6: ACUTA results for DEX function YW. 49–54, 2013. [4] Bohanec M.: DEXi: Program for Multi-Attribute Decision Making, User's Manual, Version 4.00 4 CONCLUSION . IJS Report DP-11340, Ljubljana: Jožef Stefan Institute, In this preliminary study we tried to approximate several 2013. DEX utility functions by using the basic UTA method and [5] Bohanec M., Zupan B.: A function-decomposition its derivative, ACUTA. In general, the approach turned out method for development of hierarchical multi-attribute to be feasible, producing marginal utility functions from decision models. Decision Support Systems 36, 215– DEX utility functions, which are defined by points in a 233, 2004. discrete multidimensional space. The obtained functions are [6] Bohanec M., Urh B., Rajkovič V.: Evaluating options easy to interpret and do provide useful information about by combined qualitative and quantitative methods. Acta DEX attributes and scales (e.g., numeric utility value for Psychologica 80, 67–89, 1992. each discrete attribute value), and the underlying DEX [7] Mileva-Boshkoska B., Bohanec M.: A method for utility functions (e.g., about relative importance of ranking non-linear qualitative decision preferences attributes). 
Therefore, the approach is useful for representing using copulas. International Journal of Decision and understanding DEX utility functions: the representation Support System Technology 4(2), 42–58, 2012. consists of a set of additive utility functions that represent [8] Mileva-Boshkoska B., Bohanec M., Boškoski P., Juričić attribute trends and importance’s that cannot be easily Đ.: Copula-based decision support system for quality observed by examining DEX utility functions themselves. ranking in the manufacturing of electronically On the other hand, we encountered several problems with commutated motors. Journal of Intelligent the methods and their implementation. UTA rarely gives any Manufacturing, doi: 10.1007/s10845-013-0781-7, 2013. results on the original DEX functions, and even after [9] Jacquet-Lagreze E., Siskos J.: Assessing a set of tweaking the inputs the results were unsatisfactory. ACUTA additive utility functions for multicriteria decision- performs much better, it can work on the whole domain of making, the UTA method, European Journal of the DEX function, but the a priori rank subset needs to be Operational Research 10(2), 151–164, 1982. carefully chosen in order to avoid convergence problems. [10] Siskos Y., Grigoroudis E., Matsatsinis N.F.: UTA The theoretical reasons for convergence problems of these methods. In: Multiple Criteria Decision Analysis: State methods are still to be determined. of the Art Surveys, 297–343, Boston: Springer, 2005. Marginal utility functions, generated by ACUTA, in [11] Bous G., Fortemps P., Glineur F., Pirlot M.: ACUTA: A principle appropriately represent the marginal behavior of novel method for eliciting additive value functions on DEX attributes, but they exhibit two common problems: the basis of holistic preference statements, European • insufficient sensitivity to changes of attribute values Journal of Operational Research 206(2), 435–444, (some sections of ACUTA functions are (almost) 2010. constant even though the underlying DEX function is [12] Ros J.C.: Introduction to Decision Deck–Diviz: not); Examples and User Guide, Technical report DEIM-RT- • inappropriately representing symmetric DEX functions 11-001, Tarragona: Universitat Rovira i Virgili, 2011. with mutually different marginal utility functions. 65 COMPARING RANDOM FOREST AND GAUSSIAN PROCESS MODELING IN THE GP-DEMO ALGORITHM Miha Mlakar, Tea Tušar, Bogdan Filipič Department of Intelligent Systems, Jožef Stefan Institute and Jožef Stefan International Postgraduate School, Jamova cesta 39, SI-1000 Ljubljana, Slovenia e-mail: {miha.mlakar, tea.tusar, bogdan.filipic}@ijs.si ABSTRACT is accurate, GP-DEMO finds high-quality results with a low number of exact solution evaluations, while if it is not, GP- In surrogate-model-based optimization, the selection of DEMO needs more exact solution evaluations to achieve sim- an appropriate surrogate model is very important. If so- ilar results. lution approximations returned by a surrogate model are Since the accuracy of the surrogate model in surrogate- accurate and with narrow confidence intervals, an algo- model-based optimization is crucial, we decided to apply two rithm using this surrogate model needs less exact solu- different modeling techniques and compare their approxima- tion evaluations to obtain results comparable to an algo- tions to determine which one is more suitable for use in a rithm without surrogate models. In this paper we com- surrogate-model-based algorithm. 
In addition to Gaussian pare two well known modeling techniques, random forest process (GP) modeling that is used in GP-DEMO, we used (RF) and Gaussian process (GP) modeling. The compar- random forest (RF) for comparison. The reason for choos- ison includes the approximation accuracy and confidence ing RF was the fact that the methodology is well-known and in the approximations (expressed as the confidence inter- that the solutions approximated with this method in addition val width). The results show that GP outperforms RF and to approximated values return also confidence intervals. that it is more suitable for use in a surrogate-model-based The structure of this paper is as follows. In Section 2, we multiobjective evolutionary algorithm. present how the comparison of RF and GP modeling tech- niques was carried out. In Section 3, we discus the results 1 INTRODUCTION gained with both techniques, compare them and determine which technique performs better. Section 4 concludes the pa- One of the most effective ways to solve problems with multi- per with an overview of the work done. ple objectives is to use multiobjective evolutionary algorithms (MOEAs). The MOEAs draw inspiration from optimization 2 COMPARISON OF RF AND GP SURROGATE processes occuring in nature and perform many solution eval- MODELS uations to find high-quality solutions. Due to the high number of solution evaluations the MOEAs are not very suitable for In this section we compare random forest and Gaussian pro- computationally expensive optimization problems where ex- cess modeling techniques used for solution approximations. act solution evaluation takes a lot of time. In order to obtain The aim of the comparison is to determine which of the two the results of such problem more quickly, we usually use sur- techniques is more suitable for use in surrogate-model-based rogate models to approximate the objective functions of the optimization. problem. To test the two techniques, we used relations under uncer- But due to inaccurate approximations, the solution com- tainty to compare their approximated solutions. If two so- parisons can be incorrect, which can result in very good so- lution approximations had overlapping confidence intervals, lutions being discarded. In order to minimize the impact of we, in order to determine their relation, exactly evaluated one incorrect comparisons, we defined the relations under uncer- solution and compared the solutions again. Together with the tainty ([5]) for comparing approximated solutions presented number of these additional exact evaluations, we measured with an approximated value and a confidence interval. By also the number of incorrect solution comparisons and the including the confidence interval in the comparison we were width of the confidence intervals. able to consider this additional information and minimize the In addition to using relations under uncertainty, we also number of incorrect comparisons. compared the approximated solutions with Pareto dominance We used these relations under uncertainty in the algo- relations and measured the number of incorrect comparisons. rithm called Differential Evolution for Multiobjective Opti- With Pareto dominance relations the confidence intervals are mization based on Gaussian Process modeling (GP-DEMO) not included in the comparisons, so in general, the number of [4]. We discovered that the quality of the gained result de- incorrect comparisons hints at the accuracy of the approxima- pends greatly on the surrogate model. 
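Neither the GP implementation from GP-DEMO nor the exact way the random-forest confidence interval is obtained is reproduced here, so the following WEKA-based Java sketch is only one way to obtain, for a single candidate solution, an approximated objective value together with a confidence interval from each of the two techniques. The file name, the use of an ensemble of randomised regression trees as the random-forest stand-in, and the ±2-standard-deviation interval are all assumptions.

import java.util.Random;

import weka.classifiers.functions.GaussianProcesses;
import weka.classifiers.trees.RandomTree;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SurrogateIntervals {
    public static void main(String[] args) throws Exception {
        // Hypothetical file of exactly evaluated solutions for one objective of a test problem.
        Instances train = new DataSource("osy_objective1.arff").getDataSet();
        train.setClassIndex(train.numAttributes() - 1);
        Instance query = train.lastInstance();   // stand-in for a new candidate solution

        // Gaussian-process surrogate: mean prediction plus a predictive standard deviation.
        GaussianProcesses gp = new GaussianProcesses();
        gp.buildClassifier(train);
        double gpMean = gp.classifyInstance(query);
        double gpStd = gp.getStandardDeviation(query);
        System.out.printf("GP : %.3f +- %.3f%n", gpMean, 2 * gpStd);

        // Random-forest-style surrogate: randomised regression trees built on bootstrap
        // resamples; the spread of the per-tree predictions plays the role of the interval.
        int trees = 100;
        double sum = 0, sumSq = 0;
        Random rnd = new Random(1);
        for (int t = 0; t < trees; t++) {
            RandomTree tree = new RandomTree();
            tree.setSeed(rnd.nextInt());
            tree.buildClassifier(train.resample(rnd));   // bootstrap sample of the training set
            double prediction = tree.classifyInstance(query);
            sum += prediction;
            sumSq += prediction * prediction;
        }
        double mean = sum / trees;
        double std = Math.sqrt(Math.max(0, sumSq / trees - mean * mean));
        System.out.printf("RF : %.3f +- %.3f%n", mean, 2 * std);
    }
}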
The solutions selected for testing were not generated randomly, but rather produced by the well-known NSGA-II algorithm [3]. This ensured that the solution comparisons were similar to the comparisons performed in evolutionary multiobjective algorithms and thus provided relevant results. In every generation NSGA-II creates a new set of solutions, adds them to the current ones and then performs selection on the union to identify the most promising ones. The selection procedure includes comparing every solution with all other solutions to determine its dominance status. These were the comparisons used in our study.
The experiments were performed on three benchmark multiobjective optimization problems. One is the Poloni optimization problem [6] and two are from [2], called OSY and SRN. All of them are two-objective problems.
For testing purposes we used GP modeling as proposed by [7] and RF modeling as proposed in [1]. For the confidence interval width of the approximation we used two standard deviations (2σ), which corresponds to about 95% of the normal distribution of the approximations. The number of trees used for building RF was 10,000 and the minimum number of elements in the leaves was set to 1.
To test the correlation between the surrogate model accuracy and the incorrect comparisons, different models of increasing accuracy were built, each on a larger number of solutions. Since building an RF surrogate model is faster than building a GP surrogate model, we, in addition to building surrogate models from 20, 30, 50, 100 and 200 exactly evaluated solutions, also built an RF surrogate model from 1000 exactly evaluated solutions. We tested how much the larger RF surrogate model built from 1000 exactly evaluated solutions increases the accuracy of the approximations.
The NSGA-II parameter values used in the experiments were the same for both modeling techniques and for all three problems. They were set as follows:
• population size: 100,
• number of generations: 100,
• number of runs: 30.
The results averaged over 30 runs are presented in Tables 1–3 (for GP modeling) and in Tables 4–6 (for RF modeling). In all settings the total number of comparisons was 3,940,200.

Table 1: Comparison of the relations under uncertainty and Pareto dominance relations for GP modeling on the Poloni problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 1,515 | 3,635,805 | 92 | 26.25
Relations under uncertainty | 30 | 682 | 3,152,124 | 80 | 15.41
Relations under uncertainty | 50 | 138 | 1,218,337 | 31 | 1.29
Relations under uncertainty | 100 | 65 | 672,384 | 17 | 0.012
Relations under uncertainty | 200 | 13 | 549,380 | 14 | 0.002
Pareto dominance relations | 20 | 367,684 | / | / | 26.25
Pareto dominance relations | 30 | 159,945 | / | / | 15.41
Pareto dominance relations | 50 | 22,032 | / | / | 1.29
Pareto dominance relations | 100 | 2,309 | / | / | 0.012
Pareto dominance relations | 200 | 1,219 | / | / | 0.002

Table 2: Comparison of the relations under uncertainty and Pareto dominance relations for GP modeling on the OSY problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 74,181 | 2,289,682 | 58 | 42.81
Relations under uncertainty | 30 | 21,861 | 1,934,212 | 49 | 25.98
Relations under uncertainty | 50 | 19,342 | 1,426,775 | 36 | 25.05
Relations under uncertainty | 100 | 144 | 712,298 | 18 | 0.07
Relations under uncertainty | 200 | 152 | 271,821 | 7 | 0.03
Pareto dominance relations | 20 | 336,049 | / | / | 42.81
Pareto dominance relations | 30 | 136,357 | / | / | 25.98
Pareto dominance relations | 50 | 49,790 | / | / | 25.05
Pareto dominance relations | 100 | 1,736 | / | / | 0.07
Pareto dominance relations | 200 | 1,453 | / | / | 0.03

Table 3: Comparison of the relations under uncertainty and Pareto dominance relations for GP modeling on the SRN problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 7,407 | 2,703,783 | 69 | 50.03
Relations under uncertainty | 30 | 16 | 2,338,535 | 59 | 0.074
Relations under uncertainty | 50 | 2 | 749,258 | 19 | 0.099
Relations under uncertainty | 100 | 3 | 359,952 | 9 | 0.022
Relations under uncertainty | 200 | 11 | 183,625 | 5 | 0.009
Pareto dominance relations | 20 | 188,401 | / | / | 50.03
Pareto dominance relations | 30 | 161 | / | / | 0.074
Pareto dominance relations | 50 | 543 | / | / | 0.099
Pareto dominance relations | 100 | 645 | / | / | 0.022
Pareto dominance relations | 200 | 648 | / | / | 0.009
Table 4: Comparison of the relations under uncertainty and Pareto dominance relations for RF modeling on the Poloni problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 22,497 | 3,906,474 | 99 | 32.89
Relations under uncertainty | 30 | 5,206 | 3,937,230 | 99 | 31.83
Relations under uncertainty | 50 | 2,180 | 3,935,723 | 99 | 28.42
Relations under uncertainty | 100 | 125 | 3,930,277 | 99 | 23.97
Relations under uncertainty | 200 | 4 | 3,909,386 | 99 | 19.76
Relations under uncertainty | 1,000 | 2 | 3,619,402 | 92 | 12.11
Pareto dominance relations | 20 | 1,021,750 | / | / | 32.89
Pareto dominance relations | 30 | 965,491 | / | / | 31.83
Pareto dominance relations | 50 | 1,043,216 | / | / | 28.42
Pareto dominance relations | 100 | 894,889 | / | / | 23.97
Pareto dominance relations | 200 | 733,044 | / | / | 19.76
Pareto dominance relations | 1,000 | 379,928 | / | / | 12.11

Table 5: Comparison of the relations under uncertainty and Pareto dominance relations for RF modeling on the OSY problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 1 | 2,663,597 | 68 | 842.41
Relations under uncertainty | 30 | 0 | 2,663,597 | 68 | 789.04
Relations under uncertainty | 50 | 0 | 2,663,597 | 68 | 767.92
Relations under uncertainty | 100 | 0 | 2,663,597 | 68 | 720.44
Relations under uncertainty | 200 | 0 | 2,663,597 | 68 | 677.79
Relations under uncertainty | 1,000 | 0 | 2,663,597 | 68 | 548.19
Pareto dominance relations | 20 | 885,416 | / | / | 842.41
Pareto dominance relations | 30 | 770,439 | / | / | 789.04
Pareto dominance relations | 50 | 810,251 | / | / | 767.92
Pareto dominance relations | 100 | 683,578 | / | / | 720.44
Pareto dominance relations | 200 | 661,919 | / | / | 677.79
Pareto dominance relations | 1,000 | 555,983 | / | / | 548.19

Table 6: Comparison of the relations under uncertainty and Pareto dominance relations for RF modeling on the SRN problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 18 | 3,384,351 | 86 | 359.51
Relations under uncertainty | 30 | 0 | 3,385,285 | 86 | 350.55
Relations under uncertainty | 50 | 0 | 3,385,242 | 86 | 308.94
Relations under uncertainty | 100 | 0 | 3,384,910 | 86 | 266.55
Relations under uncertainty | 200 | 0 | 3,378,456 | 86 | 224.77
Relations under uncertainty | 1,000 | 0 | 3,133,626 | 79 | 139.89
Pareto dominance relations | 20 | 387,854 | / | / | 359.51
Pareto dominance relations | 30 | 425,691 | / | / | 350.55
Pareto dominance relations | 50 | 365,606 | / | / | 308.94
Pareto dominance relations | 100 | 288,611 | / | / | 266.55
Pareto dominance relations | 200 | 216,634 | / | / | 224.77
Pareto dominance relations | 1,000 | 136,656 | / | / | 139.89

3 DISCUSSION

The results gained with both modeling techniques show that, irrespective of the accuracy of a surrogate model, using relations under uncertainty reduces the number of incorrect comparisons.
The comparison of the results gained with RF and GP reveals certain differences between the techniques. The main difference is in the width of the confidence intervals. RF surrogate models produce wider confidence intervals. Consequently, the number of comparisons with confidence interval reductions for RF is much higher than for GP.
In addition to yielding wider confidence intervals, the RF surrogate models are also less accurate. Comparing the number of incorrect comparisons performed with Pareto dominance relations, where the confidence intervals are not considered, we can see that the number of incorrect comparisons is higher with the RF surrogate models.
Another difference is in the correlation between the number of solutions used for building the surrogate model and the accuracy of the surrogate model. By increasing the number of solutions used, the RF surrogate models do not improve as quickly as the GP models. Even in the cases where 1000 exactly evaluated solutions were used for building the RF surrogate models, the confidence interval widths were not greatly reduced and the intervals were still much wider than the confidence intervals gained with GP models built from 200 solutions.
Looking at the number of incorrect comparisons, we can see that by using relations under uncertainty with RF the results are slightly better than with GP. The reason is that the approximations with RF have relatively wide confidence intervals, which results in a high number of confidence interval reductions. Therefore, most solutions have to be exactly evaluated in order to perform the comparisons. So the reason for a lower number of incorrect comparisons is not the higher quality of the surrogate models, but the fact that more solutions are exactly evaluated and are therefore without uncertainty. Since in surrogate-model-based optimization exactly evaluated solutions are typically computationally expensive, a modeling technique that exactly evaluates most of the solutions is not very useful.
4 CONCLUSION

In this paper we compared random forest and Gaussian process modeling techniques in the context of surrogate-model-based multiobjective optimization. We compared their approximation accuracy and the width of the confidence intervals. The results show that surrogate models built with GP modeling produce more accurate approximations with narrower confidence intervals. Due to the narrower confidence intervals, the comparisons of solutions approximated with GP modeling require fewer additional exact solution evaluations. As a result, we can conclude that GP modeling is more appropriate for use in a surrogate-model-based algorithm than RF.

References

[1] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[2] K. Deb. Multi-Objective Optimization Using Evolutionary Algorithms. Wiley, New York, 2001.
[3] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, 2002.
[4] M. Mlakar, D. Petelin, T. Tušar, and B. Filipič. GP-DEMO: Differential evolution for multiobjective optimization based on Gaussian process models. European Journal of Operational Research, 2014, doi: 10.1016/j.ejor.2014.04.011.
[5] M. Mlakar, T. Tušar, and B. Filipič. Comparing solutions under uncertainty in multiobjective optimization. Mathematical Problems in Engineering, 2014, doi: 10.1155/2014/817964.
[6] C. Poloni, A. Giurgevich, L. Onesti, and V. Pediroda. Hybridization of a multi-objective genetic algorithm, a neural network and a classical optimizer for a complex design problem in fluid dynamics. Computer Methods in Applied Mechanics and Engineering, 186(2):403–420, 2000.
[7] C. E. Rasmussen and C. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006.
69 COMPREHENSIBILITY OF CLASSIFICATION TREES – SURVEY DESIGN Rok Piltaver1,2, Mitja Luštrek2, Matjaž Gams1,2, Sanda Martinč ić – Ipšić 3 Jožef Stefan Institute - Department of Intelligent Systems, Ljubljana, Slovenia 1 Jožef Stefan International Postgraduate School, Ljubljana, Slovenia 2 University of Rijeka - Department of Informatics, Rijeka, Croatia 3 rok.piltaver@ijs.si, mitja.lustrek@ijs.si, matjaz.gams@ijs.si, smarti@inf.uniri.hr ABSTRACT means for inducing definition of comprehensibility metrics that capture fine-grained differences in classifier Comprehensibility is the decisive factor for application comprehensibility and for evaluating the induced metrics. of classifiers in practice. However, most algorithms that User survey based approach, which follows the observation learn comprehensible classifiers use classification model that comprehensibility is in the eye of the beholder [16], is size as a metric that guides the search in the space of all advocated; defining comprehensibility metric directly is not possible classifiers instead of comprehensibility - which possible because it is comprehensibility is ill-defined [13]. is ill-defined. Several surveys have shown that such simple complexity metrics do not correspond well to the 2 REVIEW OF RELATED WORK comprehensibility of classification trees. This paper therefore suggests a classification tree comprehensibility According to [16] comprehensibility measures the "mental survey in order to derive an exhaustive fit" [15] of the classification model, which has two main comprehensibility metrics better reflecting the human drivers: the type of classification model and its size or sense of classifier comprehensibility and obtain new complexity. It is generally accepted that tree and rule based insights about comprehensibility of classification trees. models are the most comprehensible while SVM, ANN and ensembles are in general black box models that can be 1 INTORDUCTION hardly interpreted by users [8, 16, 20]; however there are domain and user specific exceptions from this rule of thumb. Comprehensibility of data mining models, also termed For a given classification model, the comprehensibility interpretability [15] or understandability [1], is the ability to generally decreases with the size [2]. This principle is understand the output of induction algorithm [14]. Its motivated by Occam's razor, which prefers simpler models importance has been stressed since the early days of over more complex ones [6]. Furthermore, a rule based machine learning research [17, 19]. Kodratoff even reports model with few long clauses is harder to understand than that it is the decisive factor when machine learning one with shorter clauses, even if the models are of the same approaches are applied in industry [13]. Application absolute size [20]. Comprehensibility also decreases with domains in which comprehensibility is emphasized are for increasing number of variables and constants in a rule [20] instance medicine, credit scoring, churn prediction, and amount of inconsistency with existing domain bioinformatics, and others [8]. knowledge [1, 18]. A metric of comprehensibility is therefore needed in User-oriented assessment of classifier comprehensibility order to compare learning systems performance and as a [1] compared outputs of several tree and rule learning (part of) heuristic function used by a learning algorithm [9, algorithms and concluded that trees are more 21]. 
Majority of algorithms for learning comprehensible comprehensible then rules, and that in some cases tree size models use simple measures based on model size which may is negatively correlated with comprehensibility. Note that oversimplify the learned models. Humans by nature are the trees included in the study were simple and were mentally opposed to too simplistic representations of probably perceived as less comprehensible because they did complex relations [7], therefore it is no surprise that not agree with the users’ knowledge. Another study [12] empirical studies have shown comprehensibility to be (based on inexperienced users) compared comprehensibility negatively correlated with the complexity (size) of a of decision tables, trees and rules. The results showed that classifier in at least some cases [1]. Such simple measures the respondents were able to answer the questions faster, based on model complexity are therefore regarded as an more accurately and more confidently using decision tables over-simplistic notion of comprehensibility [8]. than using rules or trees and were clearly able to assess the Those facts motivated us to propose a survey design, with difficulty of the questions. Larger classifiers resulted in a the goal to derive an exhaustive comprehensibility metrics decrease in answer accuracy, an increase in answer time, better reflecting the human sense of classifier and a decrease in confidence in answers. Evidence that comprehensibility. Obtained insights into evaluator’s answering logical questions (e.g. validate a classifier) is judgments about classifier comprehensibility will provide 70 considerably more difficult than classifying a new instance validate the classifier, and discover new knowledge. Thus was found. However, proposition that cognitive fit of the second task - explain ask the respondent to answer classifier with the given task type influences users’ which attributes values must be changed or retained in order performance received limited support. A paper on to classify a given instance into another class. For example, comprehensibility of classification trees, rules, and tables, which habits (values of attributes) would a patient with high nearest neighbor and Bayesian network classifiers [8] probability of getting cancer (class) have to change in order stressed that graphical representation, hierarchical structure, to stay healthy? The third task - validate requires the including only subset of attributes in a tree, and respondent to check whether a statement about the domain is independence of tree branches are advantages of confirmed or rejected according to the presented classifier. classification trees. On the other hand, possible irrelevant For example: does the tree say that persons smoking more attributes and replicated subtrees enforced by the tree than 15 cigarettes per day are likely to get cancer. Similar structure decrease comprehensibility and may lead to questions were also asked in [12]. The fourth task - discover overfitting. This can be mitigated by converting a tree into a asks the respondent to find a property (attribute-value pair) rule set, which enables more flexible pruning resulting in a that is unusual for instances from one class; this corresponds more comprehensible representation. Another recognized to finding a property of outliers. 
For example, people that downside of classification trees is their Boolean logic-based lead healthy life are not likely to get cancer, except if they nature as opposed to the probabilistic interpretation of naïve have already suffered from it in the past. Bayes, which might be preferred in some applications [8]. The fifth task - rate requests the user to give the This paper focuses on the comprehensibility of subjective opinion about the classification trees on a scale classification trees; however most of the suggested ideas with five levels: very easy to comprehend, easy to could be analogically implemented on classification rules comprehend, comprehensible, difficult to comprehend, and and tables as well. The survey design enables analysis of the very difficult to comprehend. Each label of the scale is influence of tree complexity and visualization on its accompanied with an explanation that relates to the time comprehensibility. The complexity of classification tree is needed to comprehend the tree and difficulty of usually measured with the number of leaves or nodes in a remembering it and explaining it to another person. The tree or the number of nodes per branch [16, 20] while the purpose of explanations is to prevent variation in subjective suggested survey considers some additional complexity interpretations of the scale. The task intentionally follows measures as well. The influence of visualization on the first four tasks in which the respondents use the comprehensibility has been stressed [16] but empirical classifiers and obtain hands on experience, which enables studies are missing, therefore the suggested survey also them to rank the comprehensibility. The classifiers are considers visualization factors. The past empirical studies of learned on a single dataset and visualized using Orange tool classifier comprehensibility [1, 12] were performed only on [5] in order to be consistent across all the tasks and enable homogenous groups of students, therefore we suggest reliable and prompt responding. For the same reason adding data mining experts with different cultural meaningful attribute and class names are used. The first five background to the group of participants in future studies. tasks measure the influence of classifier complexity (i.e. the number of leaves, depth, branching) while the final task 3 SURVEY DESIGN measures the influence of different representations of the same tree on the comprehensibility. One possible way to estimate comprehensibility of a Task six - compare asks the respondents to rate which of classifier is to present it to a survey respondent, who will the two classification trees shown side by side is more analyze it, and then conduct an interview about comprehensible on the scale with three levels: the tree is comprehensibility. This approach is very time consuming much more comprehensible, the tree is more and may be unintentionally biased by both involved persons, comprehensible, and the trees are equally comprehensible. e.g. asking a question about comprehensibility of a model One of the trees in this task is already used in the previous may help the respondent in comprehending the classifier. five tasks - serving as a known frame of reference - while Therefore the indirect and more objective approach that was the other one is a previously unseen tree with the same also used in previous studies [1, 12] is preferred. It measures content but represented in different style. 
The position of a the performance of respondents asked to solve tasks that tree (left or right) is randomized in order to prevent bias, e.g. involve interpretation and understanding of classifiers. The assuming that the left tree is always more comprehensible. following subsections of the paper define the selected survey tasks, performance metrics, observed properties of 3.2 Performance metrics classifiers, and strategies that prevent bias. The tasks rate and compare are directed toward obtaining 3.1 Survey tasks (question types) subjective opinions rated on the given scales. The tasks classify, explain, validate, and discover are directed toward The comprehensibility survey consists of six tasks. The first objectively quantifying respondents’ performance (e.g. time task - classify asks respondent to classify an instance and correctness of answers). Corresponding performance according to a given classifier (same as in [1, 12]). Tasks 2- metrics are derived from the six metrics proposed in the 4 are based on [4], which reports that comprehensibility is experiments on conceptual model understandability [11]. required to explain individual instance classifications, The first three are explicitly measured by the survey: the 71 time needed to understand a model translates to time to using well-known machine learning algorithm rather than answer a question (longer time - less comprehensible manually constructed. Using different pruning parameters classifier); correctly answering questions about the content produces trees with different sizes. Higher branching factor translates to the probability of correct answer (higher can be achieved by replacing original binary attributes with probability - more comprehensible classifier); the perceived constructed attributes, which can be interpreted as building ease of understanding is expressed with subjective judgment deep models [20]. If possible, order of the leaves or at least of a questions difficulty (rated on scale very easy, easy, their grouping in subtrees should remain the same as in the medium, difficult and very difficult). The other measures are binary tree. Choosing a question for a given tree determines implicitly embedded in the survey design: difficulty of the number of nodes in a branch that the user will have to recalling a model is captured through descriptions of the five analyze in order to answer. In each group of questions a levels of comprehensibility scale in the rate task; problem- single parameter changes while the others remain constant. solving based on the model content is embedded in tasks 1- Finally, a well-known and comprehensible classifier 4; and verification of model content is in the validate task. visualization style must be used, e.g. Orange [5]. Order of the question may also induce bias. For 3.3 Observed classifier properties example, the learning effect can occur: the respondents need Motivated by the related work [1, 8, 12, 20] and authors’ more time to answer the first few questions, after that they experience the following tree complexity properties are answer quicker. Next, the performance of respondents drops proposed: number of leaves or nodes, branching factor, if they get tired or loose motivation, therefore the number of number of nodes in a branch, and number of instances questions must be limited. To prevent those effects, Latin belonging to a leaf. 
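As an illustration of how such properties can be computed automatically, the sketch below extracts them from a fitted scikit-learn decision tree. It is only an orientation example with our own function name; the survey itself uses trees learned and visualized in Orange [5], where trees over multi-valued attributes can also have a branching factor greater than two.

```python
from sklearn.tree import DecisionTreeClassifier

def tree_complexity(tree: DecisionTreeClassifier):
    """Complexity properties from Section 3.3 for a fitted decision tree."""
    t = tree.tree_
    is_leaf = (t.children_left == -1)          # leaf nodes have no children
    n_leaves = int(is_leaf.sum())
    n_nodes = t.node_count
    branching = 2 if n_nodes > 1 else 0        # scikit-learn trees are binary
    depth = tree.get_depth()                   # test nodes along the longest branch
    instances_per_leaf = t.n_node_samples[is_leaf]
    return {"leaves": n_leaves,
            "nodes": n_nodes,
            "branching factor": branching,
            "max branch depth": depth,
            "instances per leaf (min, max)": (int(instances_per_leaf.min()),
                                              int(instances_per_leaf.max()))}

# Example: tree = DecisionTreeClassifier(max_leaf_nodes=8).fit(X, y)
#          print(tree_complexity(tree))
```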
Proposed tree complexity properties are square ordering is used, where each question occurs exactly systematically varied in the first five tasks of the survey. once at each place in the ordering and subsequently each Also, the proposed tree visualization properties are varied respondent gets a different ordering of the questions. in the compare task: using color to enhance readability (e.g. Finally, starting each task with a test question (from the pie-charts corresponding to class distributions in nodes), different domain) reduces the learning effect as well. layout of the tree based on the depth of subtrees, and general The survey design assumes the following order of layout and readability of the visualized tree (e.g. plain text tasks: starts with the simpler and progresses to more output vs. default Weka [10] and Orange [5] visualization). difficult ones. The compare and rate tasks, related to Additionally, the survey enables contrasting: meaningful subjective opinions, are placed toward the end – after the names of attributes, attribute values and classes to respondents acquire experience with the classifiers. meaningless ones; attributes with high information gain to Demographic data (DM knowledge, age, sex, language) the ones with low gain; and meaningful aggregated reflects the heterogeneity of the respondents group and attributes contrasted to conjunctions of isolated attribute- enables detailed analysis of classifier comprehensibility per value pairs (i.e. deep structure [20]). Finally, the survey different subgroups like students or experts. Hence, the test design also enables various statistical analysis for the each group consists of data mining experts on one hand and non- single leaf (branch of the tree) or for the entire tree. experts with basic knowledge about classification on the 3.4 Avoiding implicit survey bias other. Comparing the results of the two groups as well as considering the cultural background (e.g. different mother In order to prevent bias the following issues must be tongues), can provide new insights into classifier considered: choice of the classification domain, classifiers, comprehensibility. Finally, obtaining statistically significant and respondents group, and the ordering of questions. The results requires high enough number of respondents. classification domain has to be familiar to respondents - all of them are aware of relations among attribute values and 4 SURVEY IMPLEMENTATION classes and none of them have significant advantage of more in-depth knowledge about the domain. At the same time, the This work proposes online survey in order to facilitate domain must be broad and rich enough to enable learning a accurate measurements of time, automatic checking the range of classifiers with various properties listed in 3.3. correctness of answers, saving the answers in a database and Furthermore, choosing an interesting domain motivates the allowing remote participation. Several tools for designing respondents to participate in the survey. The Zoo domain and performing online surveys exist but do not meet all of from the UCI Machine Learning Repository [3] meets all the the design requirements (see section 3): Latin square design, requirements and is highly appropriate for general and measuring the time of answering each question, automatic heterogeneous population. 
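The Latin square ordering of questions mentioned in Section 3.4 can be generated, for example, with a simple cyclic construction: row r of the square is the question list rotated by r positions, so every question appears exactly once at every position, and respondents are assigned rows in a round-robin fashion. The sketch below shows one such construction; it is only an illustration, not necessarily the procedure of the implemented survey.

```python
def latin_square_orderings(question_ids):
    """Cyclic Latin square: row r is the question list rotated by r places."""
    n = len(question_ids)
    return [[question_ids[(r + i) % n] for i in range(n)] for r in range(n)]

def ordering_for_respondent(respondent_id, question_ids):
    """Assign respondents to rows of the square in round-robin fashion."""
    rows = latin_square_orderings(question_ids)
    return rows[respondent_id % len(rows)]

# e.g. ordering_for_respondent(3, ["Q1", "Q2", "Q3", "Q4", "Q5"])
#      -> ['Q4', 'Q5', 'Q1', 'Q2', 'Q3']
```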
It requires only elementary translation to several languages, using templates to quickly knowledge about animals expressed with 17 (mostly binary) define questions for a given task and automatically checking attributes: are they aquatic or airborne, do they breathe, how the correctness of answers. Therefore, custom online survey many legs they have, do they have teeth, fins or feathers, is implemented using MySQL database, PHP and JavaScript etc. The Zoo domain induces 7 classes: mammals, fish, programming languages, and CSS for webpage formatting. birds, amphibian, reptile, mollusk, and insect. The database includes one table for demographic data The selected classifiers must vary in complexity but not with auto-increment user id as the primary key and one table in other parameters that may influence comprehensibility per task with user id and question id as the primary key. and hence bias the results. In addition, classifiers are learned Each task table includes a field representing question order 72 number, a date-time field, and field(s) representing the [5] J. Demšar, T. Curk, A. Erjavec. Orange: Data Mining respondents’ answer. Tables for tasks 1-4 additionally Toolbox in Python. Journal of Machine Learning include fields with the measured answering time, list of all Research, 14 (Aug), pp. 2349−2353, 2013. respondent clicks and associated times, and the indicator of [6] P. Domingos. The role of occam's razor in knowledge correct answer. PHP is used to dynamically generate discovery, Data Mining and Knowledge Discovery, 3, survey webpages with correct ordering of questions for pp. 409–425, 1999. each respondent and storing the answers into the database. [7] T. Elomaa. In Defense of C4.5: Notes on learning one- Question webpages are generated by a separate PHP script level decision trees. Proceedings of 11th Int. Conf. on for each task based on a template and a simple data structure ML, pp. 62-69, 1994. defining the questions. An additional PHP script is used as a [8] A. A. Freitas. Comprehensible classification models - a library of shared functions and data structures: one position paper. ACM SIGKDD Explorations, 15 (1), pp. represents instances used in the survey and the other terms 1-10, 2013. (instructions, attribute names and value, classes, etc.) [9] C. Giraud-Carrier. Beyond predictive accuracy: what? translated into English, Slovenian and Croatian languages. Proceedings of the ECML-98 Workshop on Upgrading Additionally, PHP scripts are used for backing-up and Learning to Meta-Level: Model Selection and Data checking correctness of answers, login and help pages, and a Transformation, pp. 78-85, 1998. respondent home-page providing feedback on personal [10] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. progress and performance compared to the group. SVG Reutemann, I. H. Witten. The WEKA Data Mining images representing the classification trees exported from Software: An Update. SIGKDD Explorations, 11 (1), Orange [5] were automatically translated into the three 2009. languages using a Java program – the translation table is the [11] C. Houy, P. Fettke, P. Loos. Understanding same as in the PHP library script. Understandability of Conceptual Models – What Are JavaScript is used to measure the time of answering We Actually Talking about? Conceptual Modeling - each question. When a webpage is opened, only the Lecture Notes in Comp. Sc. vol. 7532, pp. 64-77, 2012. instructions and footer of the page are visible. Clicking on [12] J. Huysmans, K. Dejaeger, C. Mues, J. Vanthienen, B. 
the button “Start solving” calls a JavaScript function that Baesens. An empirical evaluation of the displays the question (e.g. table with attribute-value pairs comprehensibility of decision table, tree and rule based and image of a tree) and the answer form (drop-down lists, predictive models. Decision Support Systems, 51 (1), radio buttons) and starts the timer. Changing a value of the pp. 141-154, 2011. answer form field records the relative time and action type. [13] Y. Kodratoff, The comprehensibility manifesto, KDD When the respondent clicks the “Finish button” , the answer Nuggets (94:9), 1994. fields are disabled, time is calculated, and question difficulty [14] R. Kohavi. Scaling Up the Accuracy of Naive-Bayes rating options are displayed. When the “Next button” is Classifiers: a Decision-Tree Hybrid. Proceedings of the clicked, the collected values are assigned to hidden form 2nd Int. Conf. on KD and DM, pp. 202-207, 1996. fields in order to pass them to the PHP script that stores the [15] O. O. Maimon, L. Rokach, Decomposition data in the database and displays the next question. Methodology for Knowledge Discovery and Data A psychologist and two DM experts analyzed the initial Mining: Theory and Applications, World Scientific survey and improved version was implemented based on Publishing Company, 2005. their comments. It passed a validation test with 15 students [16] D. Martens, J. Vanthienen, W. Verbeke, B. Baesens. answering the first task at the same time. Preliminary Performance of classification models from a user analysis of the results for 10 respondents is in line with the perspective. Decision Support Systems, 51 (4), pp. 782- expectations, thus the survey is ready to be used in order to 793, 2011. collect data about tree comprehensibility. [17] R. Michalski, A theory and methodology of inductive learning, Artificial Intelligence 20, pp. 111–161, 1983. References: [18] M. Pazzani. Influence of prior knowledge on concept acquisition: experimental and computational results. [1] H. Allahyari, N. Lavesson, User-oriented Assessment of Journal of Experimental Psychology. Learning, Classification Model Understandability, 11th Memory, and Cognition 17, pp. 416–432, 1991. Scandinavian Conf. on AI, pp. 11-19, 2011. [19] Quinlan, J.R. Some elements of machine learning. Proc. [2] I. Askira-Gelman, Knowledge discovery: comprehe- 16th Int. Conf. on Machine Learning (ICML-99), pp. nsibility of the results. Proceedings of the 31st Annual 523-525, 1999. Hawaii Int. Conf. on System Sciences, 5, pp. 247, 1998. [20] E. Sommer. An approach to quantifying the quality of [3] K. Bache, M. Lichman. UCI Machine Learning induced theories. Proceedings of the IJCAI Workshop Repository, http://archive.ics.uci.edu/ml. University of on Machine Learning and Comprehensibility, 1995. California, School of Inf. and Comp. Science, 2014. [21] Z.-H. Zhou. Comprehensibility of data mining [4] M. W. Craven, J. W. Shavlik. Extracting algorithms. Encyclopedia of Data Warehousing and Comprehensible Concept Representations from Trained Mining, pp. 190-195, Hershey, 2005. Neural Networks. Working Notes on the IJCAI’95 WS on Comprehensibility in ML, pp. 61-75, 1995. 
73 PAMETNO VODENJE SISTEMOV V STAVBAH S STROJNIM UČENJEM IN VEČKRITERIJSKO OPTIMIZACIJO Rok Piltaver, Tea Tušar, Aleš Tavč ar, Nejc Ambrožič , Tomaž Šef, Matjaž Gams, Bogdan Filipič Institut “Jožef Stefan”, Odsek za inteligentne sisteme Jamova cesta 39, 1000 Ljubljana, Slovenija e-mail: {rok.piltaver, tea.tusar, ales.tavcar, tomaz.sef, matjaz.gams, bogdan.filipic}@ijs.si ABSTRACT potenciala, ki jih sistemi hišne avtomatizacije omogočajo. Zato v [4] predlagajo uporabo tehnik strojnega učenja za Prispevek opisuje programsko opremo za pametno in prepoznavanje navad uporabnikov in gradnjo napovednih celovito vodenje sistemov v stavbi, kot so ogrevanje, modelov njihovega obnašanja ter uporabo večkriterijske prezračevanje, senčenje, razsvetljava in upravljanje z optimizacije za zagotavljanje ustreznega upravljanju viri energije. Cilj je zagotoviti čim nižje stroške in hkrati inteligentnega doma, ki zadovoljuje nasprotujoče si kriterije. čim višje udobje za stanovalce. Sistem pametne stavbe Pričujoči prispevek v 2. razdelku opisuje delovanje pridobi podatke s senzorjev, nameščenih v stavbi, in se iz sistema OpUS, ki implementira predlagane rešitve za njih nauči navad in akcij uporabnikov v preteklem pametno vodenje sistemov v stavbah na podlagi učenja in obdobju. V drugem koraku uporabi večkriterijsko večkriterijkse optimizacije. Rezultati delovanja sistema optimizacijo, ki na podlagi simulacij išče najboljše OpUS so predstavljeni na primeru uporabe v 3. razdelku nastavitve parametrov za vodenje sistemov v stavbi. Prispevek se zaključi z razpravo v 4. razdelku. Uporabniku se najboljše nastavitve parametrov prikažejo v oblik urnikov. Za vsak urnik sta dana dva 2 SISTEM OPUS podatka, udobje in cena, na podlagi katerih uporabnik izbere najprimernejši urnik in s tem na preprost način Programska oprema sistema OpUS, prikazana na sliki 1, je nastavi parametre za avtomatizacijo sistemov v stavbi, ki razdeljena v štiri sklope: beli kvadrati predstavljajo zagotovijo želeni kompromis med udobjem in stroški. vhodno/izhodne module, modra kvadrata ustrezata moduloma za učenje, zelena modulu za optimizacijo in 1 UVOD oranžna moduloma za simulacijo. Številke predstavljajo zaporedje toka podatkov skozi sistem od vhodnih senzorskih Bivalni objekti v Evropi so leta 2004 porabili 37% vse podatkov (1) do parametrov za avtomatizacijo sistemov v porabljene energije [3], v Združenih državah Amerike pa je stavbi (10). Vsebina podatkovnih tokov in delovanje bil v letu 2010 ta delež kar 41% [2]. Iskanje strategij za posameznih modulov sta opisana v nadaljevanju. zmanjšanje porabe energije je torej ena izmed ključnih nalog sodobne družbe in tema številnih raziskav, ki se ukvarjajo z razvojem učinkovitih metod vodenja naprav, ki porabijo 2.1 Pridobivanje senzorskih podatkov veliko energije. Sistemi za ogrevanje, hlajenje in Obstoječi sistemi za hišno avtomatizacijo ponujajo široko prezračevanje prostorov npr. porabijo 50% vse energije, ki jo paleto senzorjev: od senzorjev gibanja, temperature, vlažnosti stanovanjske hiše potrebujejo za obratovanje [3]. Dobre in kakovosti zraka, osvetljenosti, pretoka vode in porabe strategije morajo ustrezno obravnavati nasprotujoče si električne energije do pametnih stikal in podatkov o zahteve uporabnikov, kot sta npr. sočasno doseganje delovanju posameznih naprav. Poleg tega omogočajo tudi energetske varčnosti in visoke stopnje ugodja. zbiranje, shranjevanje in posredovanje senzorskih podatkov Zmanjševanje stroškov shranjevanja in obdelave zunanjim sistemom (slika 1, točka 1). 
Sistem OpUS uporablja podatkov, dostopnost senzorjev in aktuatorjev ter enostavno modul za pridobivanje senzorskih podatkov, ki mora biti povezovanje različnih naprav v skupen sistem omogočajo prilagojen protokolu komunikacij in formatu podatkov, ki ga uporabo kompleksnih metod vodenja tudi v manjših bivalnih podpira sistem hišne avtomatizacije – to omogoča enotah. Obstoječi sistemi pametnih hiš sicer omogočajo prilagoditev sistema OpUS različnim sistemom hišne avtomatizacijo delovanja sistemov v stavbah po vnaprej avtomatizacije. Pridobljeni senzorski podatki se pretvorijo v nastavljenih urnikih, preklapljanje med načini delovanja poenoteno obliko, ki ob vsaki spremembi shrani čas, tip glede na zaznano prisotnost uporabnikov ali na zahtevo senzorja (določa mersko enoto, natančnost in frekvenco uporabnika preko spletnega vmesnika. Vendar večini meritev ipd.) in identifikacijo senzorja (določa lokacijo uporabnikov ne uspe nastaviti primernega urnika za senzorja in povezavo z zabeleženimi preteklimi vrednostmi) avtomatizacijo, saj morajo pri tem nastaviti veliko pogosto ter novo vrednost. Poenoteni podatki se shranijo v nerazumljivih parametrov in upoštevati nenehne spremembe podatkovno bazo za kasnejše analize in prikaz uporabniku ter svojih potreb in zunanjih vplivov, kot so vreme in cene se na zahtevo posredujejo moduloma za učenje (slika 1, točka energentov. Poleg tega take rešitve ne izrabijo celotnega 2). 74 2.2 Modula za učenje zanj ni dovolj udobno: npr. previsoka temperatura, slaba Pretekle raziskave so pokazale, da lahko z uporabo podatkov osvetljenost ali kakovost zraka. Zbrane podatke modul o prisotnosti uporabnikov in njihovih akcijah napovemo analizira v kontekstu časa, prostora in prepoznane aktivnosti. prisotnost ali odsotnost uporabnikov ter njihove navade z Če zazna, da uporabnik pri določeni aktivnosti v določenem relativno visoko točnostjo [1]. Na tej osnovi sta bila razvita prostoru večkrat izvede enako akcijo, iz tega sklepa, da je modula za učenje navad in akcij, opisana v nadaljevanju. nastavitev scene (opisana v [5]) za to aktivnost in prostor Modul za učenje navad periodično zahteva časovno okno neprimerna ter predlaga njeno spremembo. podatkov, iz katerih prepozna prisotnost in odsotnost Oba modula poleg specifičnih metod za prepoznavanje uporabnikov ter njihove aktivnosti: spanje, pripravo obroka akcij in aktivnosti uporabljata standardne algoritme strojnega in prehranjevanje, uporabo kopalnice ipd. Prisotnost učenja, da zgradita model, ki uporabniku zagotavljajo udobje. uporabnika v določenem prostoru prepozna neposredno iz Model vsebuje podatke o tem, kakšna je verjetnost, da podatkov o uporabi stikal v prostoru in zaznavah senzorjev uporabnik na določen dan v tednu ob določenem času gibanja, ostale aktivnosti pa s pomočjo zlivanja senzorskih potrebuje neko sceno (povezano z aktivnostjo uporabnika), in podatkov in uporabo konteksta: čas, prostor in predhodne kake vrednosti parametrov naj bodo nastavljene za aktivnosti. Npr. prižgana luč v kopalnici in 7 minut pretoka posamezno sceno (temperatura zraka, osvetljenost, zaprta tople vode sovpadata z aktivnostjo uporabe kopalnice; okna idr.) [5]. Model se posreduje modulu za optimizacijo, ugasnjene ali zatemnjene luči, odsotnost gibanja ter drugih kot prikazuje slika 1, korak 3. akcij uporabnikov ob podatku, da oseba ni zapustila stavbe 2.3 Modul za optimizacijo ter da je ura 4 zjutraj, sovpadajo z aktivnostjo spanje. Cilj optimizacije je poiskati nedominirane (t.j. 
najboljše) Prepoznavanje aktivnosti je pomembno, ker določa okoljske urnike po kriterijih udobja in cene – namesto te se lahko parametre, ki so ob določeni aktivnosti za uporabnika udobni: uporablja tudi količina porabljenih energentov ali količina v času aktivne prisotnosti mora biti temperatura v stavbi posledično izpuščenega CO primerna, zrak svež in ne presuh ali preveč vlažen; v času 2. Ker sta si kriterija udobje in cena nasprotujoča, je izhod postopka optimizacije množica spanja so lahko temperatura, osvetljenost in zaloga tople vode urnikov, ki so med sabo neprimerljivi (boljši v enem kriteriju nižje; v času odsotnosti temperatura in osvetljenost nista in slabši v drugem) in boljši od vseh ostalih urnikov. Urnik je pomembni, okna pa morajo biti zaprta. Samodejno učenje predstavljen kot zaporedje 15-minutnih časovnih intervalov spreminjajočih se uporabnikovih navad odpravi potrebo po za katere je treba določiti parametre vodenja posameznih ročnem (po)nastavljanju urnikov za avtomatizacijo sistemov sistemov v stavbi. Za iskanje nedominiranih urnikov se v stavbi ter hkrati omogoči boljše nastavitve, ki temeljijo na uporablja algoritem večkriterijske optimizacije, ki podatka o natančnih statističnih podatkih o pretekli uporabi. ceni in udobju urnika pridobi od modula za simulacijo (slika Modul za učenje akcij periodično zahteva podatke o 1, korak 6). akcijah uporabnika, ki jih le-ta izvede, kadar okolje v stavbi Slika 1: Programski moduli sistema OpUS in podatkovni tokovi med njimi 75 2.4 Simulacija 3 PRIMER UPORABE Modul za simulacijo na vhodu sprejme (slika 1, korak 5) V tem razdelku predstavljamo rezultate pametnega vodenja model stavbe, ki npr. določa toplotne izgube, porabo energije na primeru stavbe s fotovoltaičnimi paneli, kjer lahko posameznih sistemov ipd.; cene energentov, ki omogočijo določamo polnjenje in praznjenje baterije ter delovanje izračun stroškov določenega urnika; vremensko napoved, ki nekaterih porabnikov, medtem ko so scene omejene na določa pričakovane zunanje vplive na pogoje v stavbi; in nastavljanje želene temperature v stavbi. Stavbo modeliramo model uporabnikovih navad, ki je osnova za izračun udobja s simulatorjem, ki za dani urnik vrača njegove stroške in danega urnika. Simulator je osnovan na obstoječih splošnih udobje. V stroških upoštevamo tudi neporabljeno energijo v simulatorjih delovanja sistemov v stavbah, lastnostih stavbe bateriji, ki predstavlja prihodnji dobiček. Optimizacijo in konkretnih sistemov, prisotnih v stavbi ter metodi za izvajamo z evolucijskim večkriterijskim optimizacijskim izračun neudobja uporabnika glede na okoljske pogoje v algoritmom, ki poišče kompromisne urnike glede na stavbi in želene pogoje. Rezultati simulacije, ki so odvisni od obravnavana nasprotujoča si kriterija. točnosti simulatorja in vhodnih podatkov, se vrnejo modulu Optimizacijski algoritem kot vhodne podatke uporabi za optimizacijo (slika 1, korak 6), ki na njihovi podlagi podani začetni (neoptimirani) urnik stavbe in podatke o ceni predlaga nove urnike (slika 1, korak 4). Po končani energije, napovedi sončne energije, porabnikih in navadah optimizaciji se izbrani urnik in pripadajoče udobje ter cena uporabnikov. Dokler ni izpolnjen ustavitveni pogoj (čas, ki je posredujejo uporabniškem vmesniku (slika 1, korak 7). na voljo za optimizacijo), poteka preiskovanje prostora 2.5 Uporabniški vmesnik urnikov in njihovo vrednotenje preko omenjene simulacije. 
Vsak urnik opisuje delovanje stavbe med podanima Uporabniški vmesnik ponuja vizualizacijo, iz katere sta začetnim in končnim časom, pri čemer je vmesno obdobje razvidna cena in udobje najboljših urnikov. Uporabniku se ob razdeljeno na 15-minutne intervale. Urnik je sestavljen iz izbiri posameznega ponujenega urnika prikažejo razlike v naslednjih štirih komponent: trajanju in nastavitvah scen ter udobju in ceni med izbranim in trenutno nastavljenim urnikom. Množica najboljših • Temperatura: Za vsak interval določimo želeno (nedominiranih) rešitev omogoča, da uporabnik dobi vso temperaturo v stopinjah Celzija, ki mora zadoščati informacijo o delovanju sistema in se na podlagi te omejitvam (biti mora vsebovana v [Tmin, Tmax], kjer sta informacije odloči, kateri kriterijje zanj pomembnejši ter temperaturi Tmin in Tmax lahko podani za vsak interval kakšen kompromis med kriterijema mu bolj ustreza. posebej). Uporabnik lahko v koraku 8 (slika 1) izbere enega od • Energija+: V primeru, da imamo presežek energije predlaganih urnikov ter ga po potrebi prilagodi svojim željam (fotovoltaični paneli proizvedejo več energije, kot je – v tem primeru se ponovno izvede optimizacija izvajanja hiša porabi), za vsak interval določimo delovanje urnika in simulacija za oceno cene in udobja predlaganih baterije. Možni sta le dve vrednosti, in sicer 1 (baterija sprememb urnika. naj se polni) in 0 (baterija naj se ne polni). Če se baterija Sistem OpUS začne delovati z vnaprej nastavljenim ne polni, presežek energije prodajamo. urnikom, ki je dober približek splošno uporabnega urnika. • Energija– : V primeru, da imamo primanjkljaj energije Skozi čas se sistem nauči navad in potreb uporabnika ter (fotovoltaični paneli proizvedejo manj energije, kot je predlaga boljše urnike. Izboljšan urnik je primeren za hiša porabi), za vsak interval določimo delovanje uporabo, dokler ne pride do sprememb navad uporabnikov ali baterije. Možni sta le dve vrednosti, in sicer 1 (baterija do spremembe zunanjih vplivov: vremena kot posledice naj se prazni) in 0 (baterija naj se ne prazni). Če se letnih časov ali bistvene spremembe cen energentov na trgu. baterija ne prazni, potrebno energijo črpamo iz omrežja. Poleg izbire in primerjave urnikov uporabniški vmesnik • Porabniki: Za vsakega porabnika določimo čas, ko naj ponuja tudi pregled nad preteklo porabo in skladnostjo začne delovati, t.j. porabljati energijo. Končni čas in izbranega urnika s prepoznanimi potrebami uporabnika ter količina porabljene energije se izračunata iz lastnosti ročno upravljanje s sistemom za hišno avtomatizacijo. porabnika. 2.6 Vodenje sistemov v stavbi Optimizacijo smo preizkusili na naslednjem konkretnem Izbrani urnik in pripadajoči parametri za vodenje sistemov v primeru. Želimo optimirati vodenje stavbe s fotovoltaičnimi stavbi se iz uporabniškega vmesnika pošljejo modulu za paneli, eno baterijo in enim porabnikom, ki mora delovati pošiljanje navodil za avtomatizacijo (slika 1, korak 9). Le-ta enkrat dnevno. Zanima nas vodenje stavbe za dva naslednja je izhodni modul, ki mora biti prilagojen konkretnemu dneva: v prvem je napovedano jasno (sončno) vreme, v sistemu hišne avtomatizacije podobno kot modul za drugem pa pretežno oblačno vreme. Začetni urnik je določen pridobivanje senzorskih podatkov. Poleg določenega urnika na podlagi vremenske napovedi in uporabnikovih navad ter modul sprejme tudi podatke o zaznanih aktivnostih želenih temperatur. Optimizacijo izvajamo, dokler ne uporabnikov, ki jih prepozna modul za učenje navad, na pregledamo 1000 urnikov. 
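Spodnja skica zgolj ilustrira opisano predstavitev urnika (15-minutni intervali, želena temperatura, polnjenje in praznjenje baterije, zagoni porabnikov) ter preverjanje dominiranosti po kriterijih cene in neudobja. Imena razredov in polj so izmišljena in ne ustrezajo dejanski implementaciji sistema OpUS; udobje je tu zaradi enostavnosti izraženo kot neudobje, ki ga minimiziramo.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Urnik:
    """Urnik za obdobje, razdeljeno na 15-minutne intervale."""
    zelena_temperatura: List[float]   # ciljna temperatura [°C] za vsak interval
    polnjenje_baterije: List[int]     # Energija+: 1 = polni, 0 = ne polni (ob presežku)
    praznjenje_baterije: List[int]    # Energija-: 1 = prazni, 0 = ne prazni (ob primanjkljaju)
    zacetki_porabnikov: dict = field(default_factory=dict)  # porabnik -> indeks intervala zagona

@dataclass
class OcenaUrnika:
    """Rezultat simulacije urnika."""
    cena: float        # stroški (vključno z vrednostjo neporabljene energije v bateriji)
    neudobje: float    # odstopanje od želenih pogojev (nižje = večje udobje)

def dominira(a: OcenaUrnika, b: OcenaUrnika) -> bool:
    """Urnik a dominira urnik b, če ni slabši po nobenem kriteriju in je boljši vsaj po enem."""
    return (a.cena <= b.cena and a.neudobje <= b.neudobje and
            (a.cena < b.cena or a.neudobje < b.neudobje))
```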
podlagi katerih preklaplja med scenami, kadar se pričakovana Slika 2 predstavlja rezultate tega poskusa. Zeleni krožci aktivnost na urniku ne ujema z zaznano aktivnostjo (slika 1, prikazujejo vse generirane urnike. Začetni urnik je obarvan korak 10). rdeče, nedominirani urniki (vseh je deset) pa so predstavljeni 76 Slika 2: Vsi urniki dobljeni po postopku več kriterijske optimizacije z modrimi pikami. Kot lahko vidimo, optimizacijski algoritem najde različne kompromise med stroški in udobjem, ki so po obeh kriterijih boljši od začetnega urnika. Najugodnejši dobljeni urnik je podrobneje obrazložen v nadaljevanju. Slika 3 prikazuje dva grafa s podrobnejšo informacijo o najugodnejšem dobljenem urniku. Siva območja označujejo obdobja, ko uporabnik ni prisoten v stavbi. Takrat se meri samo poraba energije, ne pa tudi udobje. Zgornji graf na sliki 3 prikazuje ciljno temperaturo (tisto, ki ustreza največjemu Slika 3: Podrobnosti najugodnejšega urnika. Siva območ ja udobju), nastavljeno temperaturo ter dejansko temperaturo, ki označ ujejo obdobja, ko uporabnik ni prisoten v stavbi. jo izmeri simulator, ko stavba poskuša voditi sisteme gretja in ohlajanja tako, da se čim bolj približa nastavljeni udobnejši in cenejši od ročno nastavljenih urnikov, saj se temperaturi. Vidimo, da je razkorak med nastavljeno in lahko sproti prilagajajo zunanjim vplivom in potrebam dejansko temperaturo precejšen predvsem v obdobjih uporabnikov. Uporaba takega sistema odpravi tudi potrebo po neprisotnosti, ko se stavba ne hladi oz. ogreva. Spodnji graf ročnem nastavljanju urnikov in hkrati spodbuja energetsko kaže, kaj se v določenem intervalu dogaja z energijo. Baterija varčnost uporabnikov. se včasih polni (pozitivna energija) in včasih prazni (negativna energija), energijo prodajamo (pozitivna energija) LITERATURA in kupujemo (negativna energija), vidimo tudi, v katerih [1] Gjoreski M., Gjoreski H., Piltaver R., Gams M. intervalih obratuje porabnik. Na grafu je označena tudi tarifa "Predicting the arrival and the departure time of an kupovanja energije, ki je lahko bodisi visoka bodisi nizka. V employee." Zbornik 16. mednarodne multikonference prvem, sončnem dnevu, fotovoltaični paneli proizvedejo Informacijska družba – IS 2013, str. 43–46. veliko energije, ki se deloma shrani v baterijo, deloma pa [2] Annual Energy Outlook 2012. U.S. Energy Information proda. V drugem dnevu je takšne energije zelo malo. Večina Administration (EIA), 2012. energije, ki se je shranila v baterijo, ostaja v bateriji tudi po [3] Perez-Lombard L., Ortiz J. in Pout C. "A review on koncu urnika (za porabo v prihodnjem obdobju). buildings energy consumption information." Energy and Buildings, 40 (3): 394–398, 2008. 4 RAZPRAVA [4] Šef T., Piltaver R., Tušar T. "Projekt OpUS: optimizacija Prispevek opisuje arhitekturo programske opreme, ki s upravljanja energetsko učinkovitih pametnih stavb." pametnim vodenje sistemov v stavbi rešuje pereč problem Zbornik 16. mednarodne multikonference Informacijska zagotavljanja visoke stopnje udobja in hkrati nizkih stroškov. družba – IS 2013, str. 110–113. Arhitektura temelji na ideji uporabe strojnega učenja [5] Tavčar A., Piltaver R., Zupančič D., Šef T., Gams M. uporabnikovih navad in potreb ter večkriterijske optimizacije "Modeliranje navad uporabnikov pri vodenju pametnih parametrov vodenja na podlagi simulacije. Poleg arhitekture, hiš." Zbornik 16. mednarodne multikonference ki omogoča vključitev v obstoječe sisteme pametnih stavb, so Informacijska družba – IS 2013, str. 114–117. 
predlagane tudi formalne predstavitev problemov učenja, optimizacije in simulacije ter algoritmi, ki so primerni za reševanje teh problemov. Primer uporabe predlaganih rešitev kaže, da je tak sistem sposoben predlagati urnike, ki so 77 DETERMINATION OF CLASSIFICATION PARAMETERS OF BARLEY SEEDS MIXED WITH WHEAT SEEDS BY USING ANN Kadir Sabancı1, Cevat Aydın2 1 Department of Electrical and Electronics Engineering, Batman University, Batman, Turkey 2 Department of Agricultural Machinery, Selçuk University, Konya, Turkey Tel: +904882173500; fax: +904882173601 e-mail: kadir.sabanci@batman.edu.tr ABSTRACT One of the basic problems that cause loss of yield in wheat is weed seeds that mixed with wheat seeds. In this performance characteristics to biological neural networks study, discrimination of barley seed which mixed with [8]. Simply, ANN that imitates the function of the human wheat seeds has been realized. Classification of wheat brain has several important features such as learning from and barley seeds has been achieved by using artificial data, generalizing, working with an unlimited number of neural network and image processing techniques. In the variables etc. study, image processing techniques and the use of It is seen that ANN is used in crop production, which artificial neural network have been made possible with constitutes an important field of agriculture engineering, in Matlab software. By using Otsu method, histogram data identification and classification stages of a wide range of of seed images that were taken from web camera was agricultural products such as grape, wheat, peppers and obtained. By using histogram data, with multi-layered olives [9]. artificial neural network model, the system was In this study, a software has been developed for educated and classification was made. Besides, wheat distinguishing the wheat and barley seeds which has been and barley seeds in the picture info where mixed seeds mixed during harvesting. Wheat and barley seeds which has taken from the web camera exist were counted. been mixed, have been attempted to distinguish by using image processing techniques and artificial neural networks. Multilayer artificial neural networks has been performed for 1 INTRODUCTION the process to be more precise and faster. System has been trained by using barley and wheat seeds pictures. Wheat and Quality is one of the important factors in agricultural barley seeds have been classified successfully by using products marketing. Grading machines have great role in improved system. This study exemplifies image processing quality control systems. The most efficient method used in and artificial neural networks in agriculture. grading machines today is image processing. Digitisation of the image is the process in which the image in the camera is converted to electical signals with optical – 2 MATERIAL METHOD electrical mechanism [1]. Image processing, as a general term, is manipulation and In this study, image of wheat and barley seeds photos have analysis of the pictorial information [2]. been taken by using a webcam with 1.3 MP (Mega Pixels) Image processing techniques are used in different areas such and having CCD sensor. Usage of image processing and as industry, security, geology, medicine, agriculture. Image artificial neural networks are provided by Matlab. In this processing and artificial neural networks are used in study, 50 wheat seeds, 50 barley seeds were used. 
Black agriculture in fruit color analysis and classification, root background is used at the stage of image processing for growth monitoring, measurement of leaf area, determination faster and correct results. of weeds [3,4,5,6,7]. Firstly, wheat and barley seeds image information was Artificial neural networks is an information processing received to obtain image informations that was to enter to system which have been exposed with inspiration of Artificial neural networks. Picture information of wheat and biological neural networks and includes some similar barley seeds are shown in Figure 1. 78 Figure 3: Binary image information belong to wheat and Figure 1: Wheat and barley seeds barley seeds Wheat and barley seeds image information was converted to gray level images. Filtration was performed to pictures In this study, Matlab Software’s Artificial Neural Network tor reduce noise and interference. Wheat and barley seeds toolbox were used to distinguish wheat and barley seeds. pictures which were converted into gray levels are shown in ANN 's main tasks are to learn structure in the model data Figure 2. set, to make generalizations in order to fulfill to required task. To make this, the network is trained with the samples of related event to make generalization. Multi-layered artificial neural networks are the most commonly used in ANN models. Figure 2: Gray level images belong to wheat and barley seeds Image information which is at gray level were converted to black and white picture by using Otsu method. Otsu algorithm provides the clustering of these pixels according to the distribution of pixel values in the image. Thresholding process is one of the important processes in image processing. Especially, this method is used for highlighting closed and discrete areas of the object in the image. It Figure 4: Multi-layered artificial neural network includes the arrangement of image which was divided into pixels until to the image in dual structure. Simply, In the study, neural network model with multilayer, thresholding process is a process of discarding pixel values feedforward, back propagation was used. Multilayer on the image according to specific values, and replacing Perceptron (MLP) networks are a feedforward neural other value / values. Thus determination of object lines and network model which has different number of neurons in the backgrounds of the object on the image were provided [10]. input layer, an intermediate layer consisting of one or more Threshold value is determined by using Otsu method. if it is layers(s) and consisting of output layer. The structure of under this value, pixels are converted to 0 value; if it is over MLP neural network is shown in Figure 4. MLP neural this value pixels are converted to 1 value. Wheat and barley network outputs of the neurons in a layer are connected to all seeds pictures in black and white pictures are shown in input of the neurons with weights. The number of neurons in Figure 3. the input and output layer is determined according to the implementation problems. The number of intermediate layers, number of neurons in the intermediate layer and activation function are determined by the designer by trial and error method [11]. 79 Segmentation process was performed by using digital image References processing techniques on images belonging to mixed wheat and barley seeds and by determining the place of each seeds [1] Yaman, K., 2000. Görüntü işleme yönteminin Ankara on the picture. 
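The processing chain described above (grayscale conversion, noise filtering, Otsu thresholding, flattening the binary image into a feature vector, MLP classification) was implemented by the authors in Matlab. Purely as an illustration, an equivalent sketch using OpenCV and scikit-learn could look as follows; the variable names, the placeholder lists of image paths and labels, and the chosen hidden-layer size are assumptions, not the authors' code.

```python
import cv2
import numpy as np
from sklearn.neural_network import MLPClassifier

def seed_to_feature_vector(image_path, size=(100, 100)):
    """Grayscale -> smoothing filter -> Otsu threshold -> flattened binary vector."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    gray = cv2.medianBlur(gray, 3)                        # simple noise removal
    _, binary = cv2.threshold(gray, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    binary = cv2.resize(binary, size, interpolation=cv2.INTER_NEAREST)
    return binary.flatten().astype(np.float32)

# seed_image_paths / seed_labels: prepared beforehand (placeholders);
# labels: 0 = wheat, 1 = barley
X = np.stack([seed_to_feature_vector(p) for p in seed_image_paths])
y = np.array(seed_labels)

clf = MLPClassifier(hidden_layer_sizes=(100,), activation="logistic",
                    max_iter=250).fit(X, y)
```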
Segmentation was performed with digital image processing techniques on the images of the mixed wheat and barley seeds, determining the location of each seed in the picture. Each seed was cropped to a size of 100x100 pixels. First, the digital image of each seed was converted to a gray-level image. The picture was filtered in order to remove noise and very small objects (dust, etc.). The noise-free gray-level pictures were converted to black-and-white pictures by using the Otsu method. The data sets that enter the ANN were created by converting the 100x100 black-and-white picture of each seed to a column matrix.

3 CONCLUSION
The average classification success of the MLP model on the test set reached 100% in the structure with 100 neurons in the hidden layer. When creating the MLP structure, the logarithmic sigmoid was used as the activation function of the neurons in the hidden and output layers. Error back-propagation was used for training the ANN model and the network was trained for 250 steps. The results obtained in the classification using the MCA process are presented in Table 1.

Table 1: Classification results using the MCA process

Number of neurons in hidden layer | Correctly classified wheat seeds (of 50) | Correctly classified barley seeds (of 50) | Average success (%)
25 | 44 | 48 | 92
50 | 45 | 49 | 94
75 | 47 | 49 | 96
100 | 50 | 50 | 100

In this study, gray-level image information of wheat and barley seeds was obtained by using image processing techniques. Afterwards, the images were converted to binary pictures with the Otsu method and the system was trained with a multilayer neural network model. In the realized system, the distinguishing of mixed wheat and barley seeds was then performed. The system can be further developed with a moving belt and camera system so that the distinguishing of wheat and barley seeds is carried out in real time. The packaging of a given number of seeds could also be performed.

References
[1] Yaman, K., 2000. Görüntü işleme yönteminin Ankara hızlı raylı ulaşım sistemi güzergahında sefer aralıklarının optimizasyonuna yönelik olarak incelenmesi. Yayınlanmamış Yüksek Lisans Tezi, Gazi Üniversitesi, Fen Bilimleri Enstitüsü.
[2] Castelman, R. K., 1996. Digital image processing. Prentice Hall, Englewood Cliffs, New Jersey, USA.
Neuman, M. R., H. D. Sapirstein, E. Shwedyk and W. Bushuk. 1989. Wheat grain colour analysis by digital image processing. II. Wheat class discrimination. Journal of Cereal Science 10: 183-188.
[3] Keefe, P. D. 1992. A dedicated wheat grain image analyzer. Plant Varieties and Seeds 5: 27-33.
[4] Trooien, T. P. and D. F. Heermann, 1992. Measurement and simulation of potato leaf area using image processing. Model development. Transactions of the ASAE 35(5): 1709-1712.
[5] Pérez, A. J., F. Lopez, J. V. Benlloch and S. Christensen. 2000. Colour and shape analysis techniques for weed detection in cereal fields. Computers and Electronics in Agriculture 25: 197-212.
[6] Dalen, G. V. 2004. Determination of the size distribution and percentage of broken kernels of rice using flatbed scanning and image analysis. Food Research International 37: 51-58.
[7] Jayas, D. S. and C. Karunakaran. 2005. Machine vision system in postharvest technology. Stewart Postharvest Review, 22.
[8] Fausett, L., 1994. Fundamentals of Neural Networks: Architectures, Algorithms and Applications. Prentice Hall.
[9] Kavas, G., Kavas, N., 2012. Gıdalarda yapay sinir ağları ve bulanık mantık. DÜNYA yayıncılık, GIDA Dergisi 2012-01: 93-96.
[10] Yaman, K., Sarucan, A., Atak, M., Aktürk, N., 2001. Dinamik çizelgeleme için görüntü işleme ve ARIMA modelleri yardımıyla veri hazırlama. Gazi Üniv. Müh. Mim. Fak. Dergisi, 16(1): 19-40.
[11] Öztemel E., 2003. Yapay Sinir Ağları. İstanbul: Papatya Yayıncılık.
This study is an example of using image processing and neural network in agricultural field. 80 NOVI GOVOREC: NARAVNO ZVENEČ KORPUSNI SINTETIZATOR SLOVENSKEGA GOVORA Tomaž Šef Odsek za inteligentne sisteme, Institut “Jožef Stefan”, Jamova cesta 39, 1000 Ljubljana e-mail: tomaz.sef@ijs.si POVZETEK organizacijami, ker s tem podjetje pridobi dostop do V najnovejšega znanja in tehnologij. Povečuje se delež članku je predstavljen prototip novega naravno zvene vlaganj v raziskovalno razvojno dejavnost v celotnem čega korpusnega sintetizatorja slovenskega govora. Temelji na govorni zbirki, ki jo razvijata prometu podjetja, poveča pa se tudi obseg sredstev za Institut »Jožef Stefan« in podjetje Amebis. Trenutna raziskovalno razvojno dejavnost. demo verzija prototipa sintetizatorja uporablja Pričujoče delo je nastalo v okviru projekta »Analiza in četrtino te zbirke. Podprta sta po en moški in en ženski glas. ovrednotenje naprednih tehnologij govorjenega jezika v Prototip je bil razvit v okviru projekta »Analiza in pametnih stavbah« (Raziskovalni vavčer 2012, Amebis ovrednotenje naprednih tehnologij govorjenega jezika v d.o.o., Kamnik). Namen projekta oz. raziskave je pridobitev pametnih stavbah« (Raziskovalni vavčer 2012, Amebis novega znanja in spretnosti za nadgradnjo obstoječih d.o.o., Kamnik). sistemov govornih in jezikovnih tehnologij z namenom uporabe v sodobnih inteligentnih vmesnikih pametnih stavb. 1 UVOD Posebna pozornost je namenjena dinamičnemu podajanju govornih informacij. Projekt vsebuje dve aktivnosti. V Za angleški jezik in druge večje jezike so različni govorno okviru prve aktivnosti so se kritično analizirale in podprti sistemi že nekaj časa dosegljivi in imajo zelo širok ovrednotile napredne tehnologije govorjenega jezika v krog uporabnikov. V zadnjem času se čedalje pogosteje pametnih stavbah. Sinteza govora, razpoznavanje govora, uporabljajo tudi v različnih mobilnih aplikacijah, ki pa v razpoznavanje govorcev ter njihovega psihofizičnega stanja našem domačem slovenskem jeziku žal niso dostopne oz. ne s pomočjo računalniške analize govorjenega zvočnega delujejo. signala, odpirajo povsem nove dimenzije razvoja Najbolj naravno zveneči sintetizatorji govora temeljijo na inteligentnih uporabniških vmesnikov. Govorni vmesniki so korpusni sintezi. Metoda temelji na preiskovanju vnaprej nadvse primerna tudi kot pomoč invalidom (npr. slepim in posnete in označene govorne zbirke. Išče se zaporedja tistih slabovidnim), starejši populaciji in nekaterim drugim posnetih glasov pri katerih se želene lastnosti čim bolj družbenim skupinam. Druga aktivnost se osredotoča ujemajo. Kvaliteta takšnih sintetizatorjev govora je predvsem na dinamično podajanje informacij ali opozoril v predvsem odvisna od zasnove govorne zbirke na kateri govorni obliki. Takšni sistemi so jezikovno odvisni, zato temeljijo. V splošnem velja, da je sintetizator govora tujih rešitev ni mogoče kupiti oz. ustrezno prilagoditi našim kvalitetnejši, če uporabljamo za sintezo daljše osnovne potrebam. V Sloveniji se pojavlja čedalje večja potreba oz. segmente s čim manj spremembami prozodičnih parametrov, povpraševanje po kvalitetnem, naravno zvenečem, saj te povzročajo dodatna popačenja sintetiziranega govora razumljivem, čim širše sprejemljivem in splošno dostopnem [1]. Strošek razvoja korpusnih sintetizatorjev je izredno govornem bralniku slovenskih besedil, zato je bila v okviru visok, zato je večinoma na razpolago le omejeno število druge aktivnosti predlagana čim bolj optimalna zasnova in glasov. izvedba takšnega sistema. 
Rezultat te aktivnosti je tudi Komercialne raziskave s področja govornih in jezikovnih prototip novega Govoraca, ki bo v nadaljevanju podrobneje tehnologij so pri nas pogojene z majhnostjo slovenskega predstavljen. trga. Iz stroškovnega vidika je povsem vseeno ali razvijamo npr. sintetizator govora za jezik, ki ga govori milijarda ljudi, 2 GOVORNA ZBIRKA ZA KORPUSNO SINTEZO ali pa zgolj dva milijona. Slovenski trg je izredno majhen GOVORA zato brez spodbud in subvencij s strani države razvoj tako kompleksnih tehnoloških izdelkov in storitev ni mogoč. Ob Najpomembnejša dejavnika pri snovanju govorne zbirke za ustrezni subvenciji se za podjetje zmanjša tveganje potrebe korpusne sinteze govora sta izbira njene vsebine in (pre)velikih vlaganj v raziskave in razvoj, zato je podjetje označevanje posnetkov. Izbira velikosti govorne zbirke je pripravljeno vložiti tudi del lastnih sredstev. Potrebno je posledica kompromisa med želenim številom variacij glasov učinkovito sodelovanje z raziskovalno razvojnimi oz. njihovim pokritjem na eni strani ter časom in stroški vezanimi na razvoj na drugi strani. Upoštevati je potrebno 81 tudi čas za kasnejše preiskovanje govorne zbirke in potreben • Doprinos povedi je enak vsoti vseh ocen zaželenosti prostor za njeno hranjenje [2]. Kakovostna korpusna sinteza nizov (iz spiska), ki se v povedi pojavijo. zahteva, da ima govorna zbirka pravilno označeno tako • Doprinos posamezne povedi normiramo z dolžino identiteto posameznih govornih segmentov kot njihov povedi (št. besed v povedi ali št. fonemov v povedi). natančen položaj znotraj zbirke. Običajno avtomatskim • Določimo takšno utež, da bodo dolžine izbranih metodam in postopkom sledi »ročno« popravljanje oznak, ki stavkov čim bolj ustrezale statistični porazdelitvi ga je ne glede na hiter razvoj tehnologije še vedno zelo dolžin stavkov iz korpusa. veliko. • Izberemo poved z najvišjim normiranim doprinosom. 2.1 Zasnova govorne zbirke • Iz spiska odstranimo vse glasovne nize, ki jih izbrana Razvoj govorne zbirke za korpusno sintezo govora poved vsebuje. obsega naslednje korake: • Ponovno ocenimo vsako poved in izberemo • ustvari se obsežno tekstovno zbirko besedil, ki najboljšo (glede na novi spisek v katerem so izločeni pokriva različne zvrsti (dnevni časopis, revije, tisti glasovni nizi, ki smo jih že pokrili) ter leposlovje ipd.), popravimo spisek. • iz zbirke besedil se odstrani vse oznake vezane na • Postopek ponavljamo dokler ne izberemo želenega oblikovno podobo (glava besedila, tabele ipd.), števila povedi. • okrajšave, števila ipd. se pretvori v polno besedno 4. Ovrednotenje rezultatov: obliko (normalizacija besedil), • Vsakih 1000 povedi izdelamo statistiko difonov, • besedila se pretvori v predvideni fonetični prepis trifonov, štirifonov in drugih polifonov, ki jih že (grafemsko-fonemska pretvorba), pokrivamo (gre za glasovne nize, ki smo jih do takrat • optimizira se obseg zbirke glede na vnaprej že izločili iz zgoraj omenjenega spiska). pripravljene kriterije (metoda požrešnega iskanja); 5. Dodatne izboljšave algoritma: doseči želimo statistično ustrezno vzorčenje • Ker mora zbirka vsebovati vse možne kombinacije izbranega področja govorjenega jezika, difonov, algoritem popravimo tako, da difone • izbrane stavke se posname (ali pa se izlušči del dodatno utežimo glede na ostale polifone. Na takšen obstoječih zvočnih zapisov), način bo algoritem na začetku dajal prednost • posneto govorno gradivo se fonetično in prozodično povedim, ki bodo pokrile čim več novih difonov. 
označi (samodejno grobo označevanje, fino ročno Predvidoma se vsi difoni pokrijejo že po ca. 100 popravljanje). stavkih. Postopek za čim optimalnejšo izbiro povedi: • Pri trifonih in štirifonih upoštevamo pri robnih 1. Statistič na obdelava besedil: glasovih tudi podatek o glasovni skupini, ki ji • Statistično obdelamo celoten besedni korpus in pripadajo (npr. štirifon "krak" ne bo doprinesel prav določimo pogostost pojavljanja posameznih glasov in dosti novega v našo zbirko, če ta že vsebuje štirifon glasovnih nizov v besedilu. Pri tem razlikujemo še "krat"; zato oceno koristnosti takega štirifona med naglašenimi in nenaglašenimi glasovi ter glasovi, popravimo navzdol). To lahko naredimo preprosto ki se pojavljajo na koncu stavka (oz. na mestih tako, da v spisek vnesemo dodatne nize skupaj z zajema zraka - ločila). Presledke na drugih mestih njihovimi frekvencami pojavljanja v korpusu (primer lahko ignoriramo oz. odstranimo. takega štirifona: "k"+"r"+"a"+"pripornik"). • Vključimo vse stavke (povedne, velelne, vprašalne • Algoritem z različnim uteževanjem izboljšamo tako, itd.) in izdelamo statistiko posameznih vrst povedi oz. da končni nabor vsebuje različne povedi (povedne, stavkov. vprašalne, velelne, enostavne, sestavljene, 2. Izdelava spiska glasovnih nizov z oceno zaželenosti naštevanje, itd.). Tako lahko isti korpus učinkovito posameznega niza: uporabimo tudi za generiranje prozodičnih • V spisek vključimo nabor vseh teoretično možnih parametrov pri sintezi govora. kombinacij difonov; tudi tiste na katere pri statistični obdelavi nismo naleteli (zaradi robustnosti 2.2 Snemanje govorne zbirke sintetizatorja govora). Snemanje govorne zbirke je potekalo v studiu RTV • V spisek vključimo vse trifone, štirifone in (po Slovenija ob prisotnosti izkušenega tonskega tehnika. Med potrebi) ostale zaželene (najpogostejše) polifone, na 10 profesionalnimi govorci smo izbrali najustreznejši moški katere smo naleteli pri statistični obdelavi besedil. in ženski glas. Med branjem besedila so govorci imeli • Utež oz. ocena zaželenosti niza je odvisna od nameščene elektrode Laryngographa, s katerimi smo pogostosti njegovega pojavljanja v besedilu. spremljali nihanje glasilk za lažje kasnejše označevanje 3. Postopek izbire povedi: period govornega signala. Samo snemanje je zaradi • Ocenimo doprinos glasovnih nizov za vsako poved iz obsežnosti besedila, ki ga je bilo potrebno prebrati trajalo tekstovnega korpusa. več mesecev. Pri tem so nastavitve opreme ves čas ostale 82 nespremenjene. Pred vsakim snemanjem je govorec poslušal Spremenjenih oz. na novo napisanih je le nekaj modulov: svoje predhodne posnetke, s čimer se je skušalo zagotoviti • modul za nastavljanje prozodičnih parametrov je čim bolj enak način govora, z enako intonacijo ipd. izpuščen; optimizacija nastavljanja teh parametrov je sestavni del algoritma za izbiro najustreznejših 2.3 Statistični podatki o govorni zbirki govornih segmentov V tabeli 1 so podani osnovni statistični podatki o govorni • možnost spreminjanja govornih parametrov je zbirki za korpusni sintetizator slovenskega govora Amebis namenoma okrnjena; algoritem skrbi le še za glajenje Govorec. prehodov na mestih lepljenja. 
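Postopek požrešne izbire povedi iz razdelka 2.1 lahko ponazorimo s kratko skico. Gre zgolj za ilustrativen primer v Pythonu ob poenostavljenih predpostavkah: poved je predstavljena kot množica glasovnih nizov (difonov, trifonov ipd.), uteži so hipotetične, doprinos pa je normiran kar s številom nizov v povedi. Dejanska implementacija, dodatno uteževanje difonov in upoštevanje vrst povedi v sistemu Govorec se od te skice lahko razlikujejo.

```python
from collections import Counter

def greedy_select(sentences, weights, n_select):
    """
    sentences: slovar poved -> množica glasovnih nizov (difoni, trifoni ...), ki jih poved vsebuje
    weights:   slovar glasovni niz -> ocena zaželenosti (npr. pogostost v korpusu)
    Vrne seznam izbranih povedi; po vsaki izbiri se pokriti nizi odstranijo s spiska.
    """
    remaining = dict(weights)
    selected = []
    for _ in range(n_select):
        best, best_score = None, 0.0
        for sent, units in sentences.items():
            if sent in selected:
                continue
            gain = sum(remaining.get(u, 0.0) for u in units)
            score = gain / max(len(units), 1)      # normiranje doprinosa z dolžino povedi
            if score > best_score:
                best, best_score = sent, score
        if best is None:
            break
        selected.append(best)
        for u in sentences[best]:
            remaining.pop(u, None)                 # že pokriti nizi ne prinašajo več doprinosa
    return selected

# hipotetičen primer uporabe
sentences = {"poved A": {"d1", "d2", "t1"}, "poved B": {"d2", "d3"}, "poved C": {"d1", "d3", "t2"}}
weights = Counter({"d1": 5, "d2": 3, "d3": 4, "t1": 2, "t2": 1})
print(greedy_select(sentences, weights, 2))
```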
Tabela 1: Statistični podatki o govorni zbirki Amebis Govorec

Velikost besednega korpusa: 7.145.345 povedi (77 milijonov besed)
Obseg govorne zbirke: 4.000 povedi (46.785 besed)
Število različnih difonov: 1.883
Število različnih trifonov (št. kombinacij v korpusu): 21.369 (24.702)

3 NOVI GOVOREC

Novi Govorec za sintezo neomejenega slovenskega govora v osnovi ohranja nespremenjeno arhitekturo (slika 1) [3]:
• analiza besedila (predobdelava besedila, grafemsko-fonemska pretvorba),
• nastavljanje prozodičnih parametrov (trajanje, osnovna frekvenca, amplituda, premori) in
• generiranje govornega signala (izbira osnovne enote, lepljenje, sprememba govornih parametrov).

Slika 1: Zgradba sistema Amebis Govorec za sintezo slovenskega govora

Potek korpusne sinteze [4, 5]:
• na razpolago imamo večje število primerkov posamezne enote,
• za vsak segment (difon), ki ga potrebujemo pri sintezi, v govorni bazi poiščemo takšnega, ki bo »najbolje« sintetiziral ciljni segment,
• najboljše zaporedje segmentov je tisto, ki minimizira ciljno ceno (angl. »target cost«) in ceno združevanja (angl. »joint cost«) segmenta; problem je rešljiv z Viterbijevim algoritmom,

$$C(t_1^n, u_1^n) = \sum_{i=1}^{n} C^{target}(t_i, u_i) + \sum_{i=2}^{n} C^{join}(u_{i-1}, u_i)$$

pri čemer $u_i$ predstavlja parametre i-tega izbranega segmenta, $u_{i-1}$ parametre njemu predhodnega segmenta, $t_i$ pa ciljne parametre i-tega segmenta,
• prva vsota ponazarja ceno zaradi razlike med ciljno in dejansko vrednostjo parametrov izbranih segmentov, druga vsota pa ceno zaradi neujemanja parametrov na mestu spajanja dveh segmentov,
• parametri, ki jih upoštevamo pri računanju ciljne cene, so: tip fonema, fonetični kontekst, naglas, pozicija znotraj besede in povedi, tip povedi, f0, trajanje ipd.; posamezni parametri so različno uteženi ($w_k$),

$$C^{t}(t_i, u_i) = \sum_{k=1}^{p} w_k^{t}\, C_k^{t}(t_i, u_i)$$

• parametri, ki jih upoštevamo pri računanju cene združevanja, pa so: ujemanje f0, ujemanje energije, ujemanje formantov in drugih spektralnih karakteristik (MFCC koeficienti); tudi tukaj so posamezni parametri različno uteženi ($w_k$),

$$C^{j}(u_{i-1}, u_i) = \sum_{k=1}^{p} w_k^{j}\, C_k^{j}(u_{i-1}, u_i)$$

• uteži pri računanju cene združevanja nastavljamo ročno s poslušanjem,
• uteži pri računanju ciljne cene lahko izračunamo avtomatično [4] s povezavo akustičnih razdalj ter višjenivojskih fonetičnih in prozodičnih parametrov; uporabimo linearno regresijo.

Algoritmi, ki združujejo daljše segmente, se izkažejo za boljše, zato k temu »teži« večina sodobnih algoritmov. Optimizirajo se predvsem fonetični in prozodični parametri, cena združevanja zaradi akustičnih parametrov je bolj v ozadju oz. se sploh ne upošteva.

4 SKLEP

Izdelali smo prototip kvalitetnega, naravno zvenečega, razumljivega in široko sprejemljivega korpusnega sintetizatorja slovenskega govora. Zaenkrat je implementiranih le nekaj osnovnih algoritmov; naprednejši algoritmi so še v razvoju in se testirajo. Sintetizator trenutno uporablja le četrtino govornega korpusa (ca. 1000 stavkov na glas).

[3] T. Šef, Analiza besedila v postopku sinteze slovenskega govora, doktorska disertacija, Fakulteta za računalništvo in informatiko, Univerza v Ljubljani, 2001.
[4] A. Hunt, A. Black: Unit selection in a concatenative speech synthesis system using a large speech database, Proceedings of ICASSP 96, vol. 1, pp. 373-376, 1996.
[5] P. Taylor: Text-to-Speech Synthesis, Cambridge University Press, 2009.

Uporabljena govorna zbirka pokriva skoraj vse možne [6] T. Šef, M.
Romih: Zasnova govorne zbirke za kombinacije difonov in trifonov na katere smo naleteli pri sintetizator slovenskega govora Amebis Govorec, analizi besednega korpusa s preko 7 milijoni povedi. Zbornik 14. mednarodne multikonference Snemanje govorne zbirke (moški in ženski glas) je potekalo Informacijska družba, zvezek A, str. 88-91, 2011. več mesecev. Za vsak glas je bilo prebranih preko 4.000 [7] M. Rojc, Z. Kačič: Design of optimal Slovenian speech povedi povprečne dolžine 11 besed. Za lažje označevanje corpus for use in the concatenative speech synthesis zbirke smo poleg govornega signala posneli še signal system, Proceedings of the Second international Laryngographa, ki prikazuje nihanje glasilk. Sledil je ročni conference on language resources an evaluation, pregled posnetega gradiva in grobo samodejno označevanje; Athens, Greece, str. 321-325, 2000. temu sledi še fino popravljanje napak. Gre za najobsežnejšo [8] J. Žganec Gros, A. Mihelič, N. Pavešič, M. Žganec, S. izdelano govorno zbirko namenjeno sintezi slovenskega Gruden: AlpSynth – Concatenation-based Speech govora do sedaj [6,7,8]. Synthesis for the Slovenian Language, 47th Novi Govorec je že na začetku razvoja presegel naša International Symposium ELMAR-2005, Zadar, pričakovanja. V veliko delih je umetno generirani govor tako Hrvaška, str. 213-216, 2005. dober, da ga marsikateri poslušalec težko oz. sploh ne loči od običajnih posnetkov (še posebej, če so ti predvajani preko mobilnih komunikacijskih naprav ipd.). Naravnost in razumljivost govora sta povsem primerljiva s sintetizatorji govora za druge večje jezike. Poslušanje takšnega govora ni naporno, zato je sintetizator primeren za najširši krog potencialnih uporabnikov. Z nadaljnjim razvojem lahko upravičeno pričakujemo še dodatno občutno izboljšanje sintetiziranega govora. Do konca leta bo v novega Govorca vključen celotni govorni korpus, osnovni algoritmi pa bodo nadgrajeni z naprednejšimi in kompleksnejšimi. Govorni korpus bo dodatno pregledan in »očiščen« vseh zaznanih napak. Pri izbiri govornih enot bo uporabljena večkriterijska optimizacija glede na akustične, fonetične in prozodične kriterije. Uporabnik si bo sam izbral ali mu je ljubše, da sintetizator govora govori s čim bolj naravno prozodijo, ali pa mu je pomembnejša razumljivost in zveznost akustičnih parametrov, izbiranje čim daljših govornih enot ter njihovo lepljenje na fonetično najprimernejših mestih. Z drsniki ali izbiro na Pareto fronti bo na preprost način podal svoje preference. Literatura [1] A. Mihelič, J. Gros, N. Pavešič, M. Žganec: Pridobivanje govorne zbirke za korpusni sintetizator govora Phonectic, Zbornik konference Jezikovne tehnologije, str. 45-49, 2000. [2] I. Amdal, T. Svendsen: Unit selection Synthesis Database Development Using Utterance Verification, Zbornik INTERSPEECH 2005, str. 2553-2556, 2005. 84 CLOUD-BASED RECOMMENDATION SYSTEM FOR E- COMMERCE Gašper Slapnič ar1,2, Boštjan Kaluža2 1Faculty of Computer and Information Science, Večna pot 113, 1000 Ljubljana, Slovenia 2Department of Intelligent Systems, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia E-mail: slapnicar.gasper@gmail.com, bostjan.kaluza@ijs.si ABSTRACT 2 CLOUD-BASED MACHINE LEARNING PLATFORMS This paper leverages cloud-based machine learning platform to implement an item-based recommendation Recently, a wide variety of cloud-based recommendation system for an e-commerce application. The solution is systems emerged. 
All of them offer various machine- based on Prediction.IO platform, which offers a full- learning algorithms, while the implementations of the stack architecture based on MongoDB database, Hadoop prediction model generation and supported algorithms differ. framework for distributed processing, Apache Mahout In our study, we focused on the currently available systems, scalable machine learning library, and RESTful API. We which offered distributed, scalable, full-stack architecture, implemented an item-based recommendation engine for simple API access and were documented to a degree that product suggestions in an online retail store using real- allowed us a basic understanding of the whole platform. world data. Preliminary results are quite promising Available options for development and implementation of achieving Mean Average Precision of 6 %. custom algorithms were also a priority and significant advantage of a platform. 1 INTRODUCTION BigML [3] is a commercial solution exclusively based on decision trees and therefore very suitable for classifications. A challenge that many retailers are facing today in the It can be tested freely with smaller amounts of data and has saturated online market is how to gain a competitive a streamlined process for creating decision tree models. The advantage and obtain cost-effective recommendation interface is highly intuitive and the model creation is a matter features without large investment into machine learning of a few clicks. The main strength of it is the simplicity of research and development. This challenge is difficult as we usage and useful visualization of the generated decision tree are getting large amounts of data even in smaller web model. It is however limited to decision trees, which are not applications. Therefore, it is expected that the solution is suitable for recommendation problem. simple to implement, fast, distributed and scalable [1]. QMiner [4] is an open source platform that offers high levels As a solution to this challenge, cloud-based recommendation of customization, a decent amount of available algorithms, systems offer recommendation-as-a service. These solutions and is well documented. It is implemented in C++ and offers are an emerging trend lately, with an open source server a JavaScript API. The architecture is distributed and solution just recently raising $2.5 million in seed funding [2]. customizable. However, the platform is still in development The main idea is to provide a recommendation engine in a phase and faces some difficulties with deployment in cloud as a solution, whereas a retailer provides data in production environment. utilizes the output form the cloud. PredictionIO [5] is an open source solution implemented in The aim of this paper is twofold. The first goal is to review, Scala and offers APIs in most of the popular languages. It compare and evaluate several cloud-based recommendation shares similarities with QMiner’s distributed architecture systems. The second goal is to implement a reference item- and consists of four layers. At the bottom, there is a recommendation system that can be used in a real-world MongoDB database, followed by Hadoop framework for application. As a result, the paper aims at a fully working scalable distributed computing. 
The third layer is the heart of solution that is able to address the major challenges – the prediction model generation process – Apache Mahout, a scalability together with simple API access and scalable machine-learning library with many popular implementation. algorithms already implemented. The top layer contains an Preliminary results show the level of accuracy we can obtain API, which offers simple access to the prediction server. by using the default set of algorithms and parameters on real- Due to simple setup process, its open source nature and many world data. These results can be later used as a baseline available machine-learning algorithms, we chose orientation point in further comparison with custom PredictionIO for further steps. solutions. Other solutions such as SensePlatform [6], Google Prediction API [7] and Microsoft Azure ML [8] are also 85 available commercial solutions that mainly focus on cloud- based implementation of core machine-learning algorithms. 3 DATA We obtained real-world data containing orders in an online retail store in a period of one year. The data was described by item (product) data, user (consumer) data and user-item interactions (item bought) data, for example, user U1 made an order O1 in which items A,B,C,D were bought. In total, there were around 10.000 items, 36.000 users and 300.000 user-item interactions. There were some minor occurrences of missing data as well as some duplicated entries, which varied for a single character, yet represented the same product. Those entries were simply filtered with basic manual corrections. The only preprocessing included replacing the encoding of the original dataset with utf-8 encoding. 4 ITEM-BASED RECOMMENDATION Our goal was to develop a recommendation system by leveraging the PredictionIO built-in engines. Figure 1 shows the PredictionIO architecture comprising several engines. Figure 1: PredictionIO server architecture [14]. Each engine processes data and constructs a predictive model independently. There can be several engines within a Collaborative filtering [11, 12] is among the most used and single application, where each of them will serve its own successful methods for this type of recommendation prediction results based on corresponding predictive model. systems. Collaborative filtering finds the users with similar Each engine can be configured with a variety of options and preferences (user based) in such a way that it finds items, parameters for fine tuning, such as preference for newer which were similarly rated by other users (item based). items, preference for surprising discovery, custom attributes User based approach has some issues, especially with and most notably the goal to be maximized through scalability, since computation grows with both the number predictions. Based on the selected goal, which can be any of users and the number of items; hence, item-based action ( like, view, conversion) or a rating threshold (e.g. approach is more common due to its simplicity and better rating >= 3), it is possible to evaluate the available scalability. algorithms using built-in interface. The basic idea of the item based approach is to take the items some user has rated/bought and computes how similar they 4.1 Recommendation engines are to other items. Based on this similarity it then selects k PredictionIO server offers two recommendation engines: most similar items. item recommendation engine and item similarity engine. 
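To make the item-based idea described above concrete (similar items are found from the items users bought together, and the k most similar ones are recommended), the following minimal sketch scores candidate products by a simple co-occurrence count. It is only an illustration: the engines themselves rely on Mahout's implementations and, as discussed in Section 4.2, on a log-likelihood-ratio similarity rather than raw co-occurrence, and the example baskets are made up.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(baskets):
    """Count how often two items were bought together (a crude stand-in for item similarity)."""
    co = defaultdict(lambda: defaultdict(int))
    for items in baskets:
        for a, b in combinations(set(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend(cart, co, k=5):
    """Score every item not in the cart by its summed similarity to the items already in it."""
    scores = defaultdict(float)
    for owned in cart:
        for other, sim in co.get(owned, {}).items():
            if other not in cart:
                scores[other] += sim
    return sorted(scores, key=scores.get, reverse=True)[:k]

# hypothetical order history: each basket is one order
baskets = [["A", "B", "C"], ["A", "C", "D"], ["B", "C", "D"], ["A", "D"]]
co = build_cooccurrence(baskets)
print(recommend({"A", "C"}, co, k=2))
```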
Each engine builds a prediction model based on the underlying algorithm. In both cases, the prediction model is generated using Mahout's Collaborative Filtering [9] methods. Mahout's kNN (k-nearest neighbours) Item Based Collaborative Filtering was used in the item recommendation engine, and Mahout's Item Similarity Collaborative Filtering was used in the item similarity engine. Both are implemented so that they can run either on a single machine or on multiple machines in a distributed/scalable setting.

Mahout's item-based collaborative filtering implementation is based on the pseudo-code shown in Figure 2. The algorithm computes the similarity between pairs of items, where one item of the pair is an item already preferred by a user and the other item of the pair is not.

Figure 2: Pseudo-code for the recommendation algorithm.

To demonstrate the general algorithm from Figure 2 on an e-commerce example: for each product that is not yet in a user's shopping cart, it takes each product already in the shopping cart and computes the similarity between these pairs of products. It then weighs the computed preference by the frequency of common occurrences.

4.2 Mean average precision as evaluation criteria

For result evaluation, PredictionIO offers built-in evaluations where we can regulate the size of the learning and testing set as well as some other parameters, such as the number of predictions and the number of iterations. These simulated evaluations use Mean Average Precision (MAP) to measure accuracy [10].

The critical step is the evaluation of similarity. PredictionIO offers several similarity options, such as Pearson's correlation, cosine similarity, the Jaccard coefficient, the log-likelihood ratio etc., which makes it simple to compare the results each of them gives. In our case, the log-likelihood ratio was chosen as it fits our problem of product recommendation best. This similarity measure is based on finding and counting the cases where two items appear together [15] and is similar to an expanded co-occurrence matrix.

If we want to measure the precision of the prediction for the n "best recommendations" for some user, we use MAP@n:

$$\mathrm{MAP}@n = \frac{1}{N}\sum_{i=1}^{N} \mathrm{ap}@n_i,$$

where $N$ is the number of all users and $\mathrm{ap}@n_i$ is the average precision for user $i$, defined as

$$\mathrm{ap}@n = \sum_{k=1}^{n} P(k)\,\Delta R(k),$$

where $P(k)$ is the precision after recommending $k$ items, i.e., the number of correctly recommended items among the first $k$ recommendations divided by $k$, and $\Delta R(k)$ is the change of recall in step $k$, which equals $1/n$ if the $k$-th item was recommended correctly and 0 otherwise [10]. For example, if exactly one of the first three recommended items is correct and it appears at position 2, then $\mathrm{ap}@3 = P(2)\,\Delta R(2) = (1/2)\cdot(1/3) \approx 0.17$.

Figure 3: Comparison of MAP prediction accuracy for random recommendation, the item recommendation engine and the item similarity engine (the vertical axis of the chart ranges from 0 % to 7 % MAP).

6 REFERENCE IMPLEMENTATION

Based on the comparison presented in the previous section, we reused the best model and implemented PredictionIO within an Ajax-based Django application, which implements a demo e-commerce application. This allowed us to evaluate the response time and the application's behaviour and reliability in a test user environment.

Figure 4: Entry site of our demo e-commerce application [13].

5 EXPERIMENTAL RESULTS

We ran the evaluation in three iterations, each prediction offering 20 items, with 70 % of the total data used as the training set and 30 % as the testing set. The evaluation was first run for a random recommendation algorithm in order to get a baseline with which we can compare our results. We then
ran it for item recommendation engine and item similarity engine, using the above-mentioned Collaborative filtering algorithms with Log-likelihood as a similarity measure. The application offers available products to a user dynamically upon input as seen on Figure 4 ( 1. Iskalno The best result was 6% MAP accuracy using item similarity okno), which allows us to prevent invalid entries and engine as seen on Figure 3. This is notably better than the improve the user experience. Upon selection a set of five 0.1% MAP accuracy using baseline random products is returned with minimal response time as seen on recommendation. For our el-commerce example this means Figure 5 ( 3. Priporoč eni izdelki). that given 100 recommended products, 6 of these would actually be chosen by a customer and added in their shopping cart. 87 with product recommendation service based on real world data. The application is capable of taking a set of products chosen by the customer and returning a set of n products, which are predicted to be the most likely to be bought by this customer. The prediction model utilizes Prediction.IO platform using Mahout’s Item Based Collaborative Filtering, using Log-likelihood ratio as a similarity measure. The preliminary experiments showed that we can expect up to 6% MAP accuracy of predictions. This is an important result, which can be used as a baseline value in future work. It allows us to compare a custom-developed recommendation algorithms with a basic out-of-the-box solution. The experiment established grand basis for further research, that is, full-stack architecture, modeling, evaluation, and testing. Further work will include implementation of a Figure 5: Application offering 5 products to a user who selected a custom engine for prediction and model generation, which random product [13]. could be easily used in the prebuilt architecture. As displayed on Figure 6, client connects to a web server References running on a remote machine. If no products are chosen, the server returns a simple html site. When a client selects a [1] Very Large Data Bases Endowment Inc., A. Labrinidis, product from the list, the server detects it and collects the H.V. Jagadish, Challenges and Opportunities with Big Data data from the shopping cart in form of strings. This data is . sent to PredictionIO. PredictionIO processes the data, applies http://vldb.org/pvldb/vol5/p2032_alexandroslabrinidis_ a prediction model and then returns a JSON response vldb2012.pdf, 2014-09-05 containing recommendations to the web server. The [2] Techcrunch, S. O’Hear , PredictionIO Raises $2.5M For Its Open Source Machine Learning Server. webserver parses the response and displays it to the client. It is important to note that minimal coding is required for http://techcrunch.com/2014/07/17/predictionio/, 2014- interaction with PredictionIO. 09-05 [3] BigML, Inc. https://bigml.com/, 2014-09-05 [4] QMiner. http://qminer.ijs.si/, 2014-09-05 [5] PredictionIO. http://prediction.io/, 2014-09-05 [6] Sense. https://senseplatform.com/, 2014-09-05 [7] Google Developers, Google Prediction API. https://developers.google.com/prediction/?hl=sl, 2014- 09-05 [8] Microsoft Azure Machine Learning. http://azure.microsoft.com/en-us/services/machine- learning/, 2014-09-05 [9] The Apache Software Foundation, Apache Mahout. https://mahout.apache.org/users/recommender/ recommender-documentation.html, 2014-09-05 [10] Kaggle Inc. https://www.kaggle.com/wiki/ MeanAveragePrecision Figure 6: UML diagram of client - web server - PredictionIO [11] T. 
Segaran, Programming Collective Intelligence, interaction. 2007 [12] B. Sawar, G. Karypis, J. Konstan, J. Riedl, Item-Based A single machine implementation offered good response Collaborative Filtering Recommendation Algorithms. time for a small set of users trying to access the application http://www.ra.ethz.ch/cdstore/www10/papers/pdf/p519 simultaneously. In order to achieve the scalability and serve .pdf, 2014-09-05 multiple requests per second, we would have to deploy [13] Demo e-commerce application. http://prediction- multiple processing machines. io.ijs.si:8001/ [14] Basics of PredictionIO. 7 CONCLUSION http://docs.prediction.io/current/concepts/basics.html We overviewed and compared a set of distributed, scalable [15] T. Dunning, Surprise and Coincidence. cloud-based platforms for machine learning. The goal we http://tdunning.blogspot.com/2008/03/surprise-and- achieved is a fully functional demo e-commerce application coincidence.html, 2014-09-05 88 NOVEL IMAGE PROCESSING METHOD IN ENTOMOLOGY Martin Tashkoski, Ana Madevska Bogdanova Ss. Cyril and Methodious University, Faculty of Computer Sciences and Engineering, Rugjer Boshkovikj 16, 1000 Skopje, Macedonia Tel: +389 2 3070377; fax: +389 2 3088222 e-mail: tashkoskim@yahoo.com, ana.madevska.bogdanova@finki.ukim.mk ABSTRACT recognition that will stop the transfer of contaged plants and fruits at the border. Image processing and machine learning together offer Following our previous work [CIIT 2014], in this paper powerful methods for image classification. In this paper we propose a new method - Self – removing Noise Method we present novel way of processing microscopic images (SRNM) for automatic image background cleansing. We for automatic classification of two similar insects shortly describe the method for solving the problem for belonging in the family Aleyrodidae, superfamily distinction of the whiteflies using image processing and Aleyrodoidea (whiteflies). They are very similar and can machine learning techniques, and give some results gained be distinguished only in a certain stage of their from different tests. The rest of the paper is organized as development (fourth larval stage or “pupae stage”). follows. In Section 2 we present some of the work related to Following our previous work, we propose a novel image our problem. In Section 3 we present the entomology processing method for automatically removing images’ problem and the basic idea for solving this problem. The background noise. We also present results of a results of the different test are presented in Section 4, and classification process using the processed images with the conclusion and future work are presented in the final the proposed method. Section 5. 1 INTRODUCTION 2 RELATED WORK Microscopic image processing dates back half a century In this section we briefly present some work related to the when it was realized that some of the techniques of image problem of these two whiteflies and some techniques and capture and manipulation, first developed for television, existing software for image processing. could also be applied to images captured through the The authors in [3] discovered the Bemisia tabaci and microscope. Trialeurodes vaporariorum in Macedonia. They explain the Since agriculture has been important to people danger of these pests and guess that these whiteflies were thousands of years, we focus on microscopic images of transferred from other neighboring countries. pests that harms agricultural crops. 
Authors in [6] explain all of the stages and the physical In this paper we present one of the pests’ problems – look, that can be useful for understanding the differences of distinguinshing between two whiteflies lat. Trialeurodes the whiteflies and choosing the best indicator for their vaporariorum (less dangerous) and lat. Bemisia tabaci distinction. (more dangerous). They are small insects (about 1mm long) The first step for solving this problem is undertaken in with wings and bodies all covered with white, powdery wax [5]. The authors proposed the algorithm Symmetrical Self – [1]. They are fed with the plants juice and as a consecuence, Filtration that can extract the important (vasiform orifice) they may reduce vigour and growth of the plant [2]. part from the microscopic images in pupae stage. This is the Bemisia tabaci is more dangerous than Trialeurodes insect part we use for the classification process in our work. vaporariorum because their larvae can inject some enzymes Authors in [6] present an important insight in explaining into the plant and those enzymes cause chlorosis or uneven various methods for processing microscopic images. Their ripening (depending on the plant), and induce physiological work on this specific area of images is very useful, because disorders [2]. often there are dust and particles that appears as noise in the In Macedonia, these whiteflies can be found in the image and the extraction of some information is very southern parts and their occurrence has been observed 5 – 6 difficult task. years ago [3]. It is assumed that they are transferred from The authors in [7] and [8] are using Wolfram neighboring countries by importing various plants that Mathematica to solve some biological problems with image causes problems with their spreading. processing. The author in [7] presents a solution for The border custom control has developed interest in measuring the locations of particles of microscopic images. preparing an intelligent system for dangerous pest It is about needle – free injection devices that fire powdered 89 drug particles into the skin on a supersonic gas shock wave. Thin slices of a target are photographed by microscope, and on that images are applied different filters and the noise of the images is removed to find the locations of the particles. The author in [8] analyze a microscopic image of red blood cells and count them, using morphological operations and measurement tools. 3 THE METHODOLOGY In this section we present the entomology problem at hand and the methodology used for creating a descriptor, the image processing process, removing the background of the images, and the classification process. Although the whiteflies look similar, they have differences in each stage of their development. But the problem is that these whiteflies can physically change their Figure 2: Microscopic images of larva of Trialeurodes look depending on the plant they feed themselves with, vaporariorum their environment and its temperature. According to [2], the accurate indicator for distinction is based almost entirely on 3.2 Image Preprocessing their fourth larval stage or “puparial stage”. First of all, the captured images were cropped by applying the algorithm for Symmetrical Self – Filtration [5], and as an output we obtained the images with the characteristic part (vasiform orifice). These images were used as input to our method for automatically removing the background. 
3.3 Image Processing - Self – removing Noise Method (SRN Method) Figure 1: Bemisia tabaci and Trialeurodes vaporariorum, closer look to “vasiform orifice” The new method proposed in this paper, the Self – removing Noise Method, removes the noise and the On Figure 1 we can see the two larvae of both whiteflies background of the processed microscope images obtained with their characteristic parts. The parts of the larvae are: as output in 3.2. in order to prepare the images for the operculum (1), vasiform orifice (2), lingula (3), caudal classification process. The vasiform orifice part is not furrow (4), caudal seta (5). The characteristic part that is the clearly presented in all of the images. Some of them are best indicator for distinction of these two whiteflies is really noisy. For solving this problem we developed a vasiform orifice (2). Vasiform orifice of Bemisia tabaci is method that proceeds all of the images, applies some filters, thinner than the one of Trialeurodes vaporariorum. and after that as an output returns the images with the vasiform orifice part in a white background. 3.1 Obtaining the training and test set We will present the basic steps (1-6) of the method In order to develop an Entomology classification System, applied on all of the images - Figure 3. the substantial part of the solution was to find appropriate The SRN Method is as follows: images of these two whiteflies. The images of Bemisia tabaci were found on the Internet database. Since there is no Step 1: importing the images with the vasiform orifice available data base for the second whitefly, we had to make part; an image collection of ourown. The whitefly Trialeurodes Step 2: converting each of the images to “grayscale” vaporariorum was found in greenhouse in the southern parts image; of Macedonia, on the plants cucumber and tomato. There Step 3: producing 3 categories for each image and are several steps of taking these images: preparation of the classifying the pixels according to their color intensity biological samples, awareness of the mechanical damage of (black, grey and white); the samples, type and quality of the microscope. The most Step 4: applying bilateral filter to the images; difficult problem was that each larva was covered with dust Step 5: morphological image processing; which appeared as a noise at the image. At Figure 2 there Step 6: converting each of the images to binary image; are microscopic images with different zoom levels of the Step 7: deleting the isolated small groups of black larva. The top image is larva captured before removing the pixels. dust. 90 Figure 5 present some of the accepted images that were obtained as outputs of the SRN Method. Figure 5: Accepted images (the first row are images of Bemisia tabaci, the second row are images of Trialeurodes vaporariorum) Figure 3: The basic steps in the Self – removing noise Method for removing the background 4 TESTS AND RESULTS In Step 1, we import the images with the vasiform orifice The following tests we undertaken using the code written in part. The second step converts the images to “grayscale” Visual Studio 2010 in C# presented in [4]. The processed style. images obtained with the proposed SRN Method are used as input for the code, and the output is a file with In Step 3, for each image we form 3 dimensional vector parameters formatted properly for Weka and SVM light. 
As of pairs, in order to find the parameters that characterize the we presented in paper [4], the best descriptor contains the image the best way according their color intensity. Because following parameters: five different widths of the image the images were converted to “grayscale” style in Step 2, the (number of pixels from the image in one row, taken pixels values variy in scale of 0 to 1 (0 for black and 1 for consecutively from top to bottom through the image on white). First pair represents the number of pixels that has equal distances), height, and ratio (height/bottom_width). values 0 – 0.33 , the second pair – number of pixels with The SRN Method was used on 346 images total, 49 of values 0.33 – 0.67, and the third pair – number of pixels Bemisia tabaci and 297 of Trialeurodes vaporariorum. The with values 0.67 – 1 (Figure 3). This information is used in images of Trialeurodes vaporariorum were gathered from the following steps to adjust the appropriate filters. different sources and contained different background The fourth step is about applying bilateral filter on each colors. The script as output obtained 326 images total (39 of of the images, as it is shown at Figure 3. The bilateral filter Bemisia tabaci and 287 of Trialeurodes vaporariorum) [9] is a non – linear, edge – preserving and noise – reducing ready for classification and 20 rejected images. filter for images. The intensity value at each pixel in an image is replaced by a weighted average of intensity values 4.1 Classification in Weka and SVM light from the nearby pixels. This weight can be based on a Gaussian distribution. We made 3 groups with 10 tests and used different In Step 5 we made some morphological image classifiers in Weka and SVM light [11]. For the first two processing. We used the method of closing as the basic rule groups (presented in [4]) we had 109 images total with for removing noise. Closing removes small holes and it manually removed background (48 images of Bemisia tabaci tends to enlarge the boundaries of bright regions in the and 61 of Trialeurodes vaporariorum). In the first group we image and shrink background color holes in such regions. maintain even ratio in the test folder (we use 10 instances of The sixth step is converting each of the images to binary both classes for testing), and in the second group we images, and after that in the last step we use function for maintain even ratio in the training folder (we use 40 deleting the small groups of black pixels. instances of both classes for training). For the third group we Figure 4 presents some rejected images. The method had 326 images total with automatically removed counts the pixels at the borders (up, down, right and left). If background (39 images of Bemisia tabaci and 287 images of there are more black pixels than a half of the border length, Trialeurodes vaporariorum). In this group we maintain the these images are rejected for the classification process as even ratio in the test folder (we use 10 instances of each undistinguishable. class for testing). 4.2 Results In this section we present results of the classification process using the images obtained with the proposed SRN Method i.e. with automatically removed background. We maintain the even ratio in the test folder. In the paper [4] we have published the classification Figure 4: Rejected images results of manually cleaned images where we maintained the even ratio in the test folder. 
According the average best 91 results in Weka, for Bemisia tabaci and Trialeurodes using manually cleaned images in the classification process. vaporariorum were obtained with the classifier lazy.IBk We will extend our set of images of both pests and the final (90% correctly recognized instances of Bemisia tabaci, and step will be connecting all of these methods into one 95% correctly recognized instances of Trialeurodes integrated system for fast recognition of the pests with an vaporariorum). According to the average best results in user - friendly interface. We also believe that the database of SVM light for correct recognition of Bemisia tabaci were the whitefly Trialeurodes vaporariorum we created, will be obtained with the RBF kernel (for gamma=0.01) 89%, and usefull to the other entomology researchers. best results for correct recognition of Trialeurodes vaporariorum were obtained with all of the kernels except References the RBF kernel (for gamma=0.01) 97%. [1] G. S. Hodges, G. A. Evans. An identification guide to For the new group of tests with the SRN method, we the whiteflies (Hemiptera: Aleyrodidae) of the have produced 10 tests, and for each test we have taken 10 southeastern United States. Florida Dept. Agriculture, instances of Bemisia tabaci and 10 instances of Trialeurodes Division of Plant Industry, Gainesville, FL 32614. vaporariorum for testing of the total set with 326 instances. 2005. For every test we performed classifications in Weka (for [2] C. Malumphy. Protocol for the diagnosis of quarantine different classifiers) and in SVM light (for different kernels). organism Bemisia tabaci (Gennadius). Central Science According the average best results (Figure 6) in Weka Laboratory, Sand Hutton, York YO41 1LZ, UK. for Bemisia tabaci, were obtained with the classifiers [3] С. Банџо, Р. Русевски. Bemisia tabaci Genad. Bayes.BayesNet (78% correctly recognized), and best Присуство и распространување во Република results for correctly recognizing Trialeurodes vaporariorum Македонија, XXXI - во традиционално советување were obtained with the classifier trees.J48 - 89%. за заштита на растенијата на Република Македонија, According to the average best results (Figure 6) in SVM Охрид, 2006. In Macedonian. light for correctly recognizing Bemisia tabaci were obtained [4] M. Tashkoski, A. M. Bogdanova. Image classification with the RBF kernel (for gamma=0.001) 70%, and best in Entomology. In: Proceedings of 11th International results for correct recognition of Trialeurodes vaporariorum Conference on Informatics and Information were obtained with all of the kernels except the RBF kernel Technologies, CIIT 2014. (for gamma=0.001) 85%. [5] М. Киндалов. Препознавање на слики во ентомологијата и практична имплементација со Bemisia tabaci и Trialeurodes vaporariorum. Дипломска работа, ментор: проф. Д-р Ана Мадевска Богданова, Скопје (2009). In Macedonian. [6] Q. Wu, F. A. Merchant, K. R. Castleman. Microscope image processing. ISBN: 978-0-12-372578-3. Elsevier Inc. 2008. [7] J. McLoone. Buiding a microscope application in Mathematica, 2011. http://blog.wolfram.com/2011/09/09/building-a- microscopy-application-in-mathematica/ [8] S. Ashnai. How to count cells, annihilate sailboats, and warp the Mona Lisa, 2012. Figure 6: Average of the results in Weka and SVM light with http://blog.wolfram.com/2012/01/04/how-to-count- even ratio of the both classes in the testing set (for cells-annihilate-sailboats-and-warp-the-mona-lisa/ automatically cleaned images) [9] C. Tomasi, R. Manduchi. 
Bilateral Filtering for Gray and Color Images. Print ISBN: 81-7319-221-9. Proceedings of the 1998 IEEE International Conference 6 CONCLUSION on Computer Vision, Bombay, India, 1998. [10] M. H. Malais, W. J. Ravensberg. Knowing and In this paper we proposed a new method for automatically recognizing - The biology of glasshouse pests and their removing background of the images of Bemisia tabaci and natural enemies. Publisher: Koppept Biological Trialeurodes vaporariorum - Self – Removing Noise Method Systems. ISBN 90 5439 126 X. 2003. in order to develop an automatic system for classification. [11] T. Joachims. Optimizing Search Engines Using We tested with several classifiers, using the images that we Clickthrough Data, Proceedings of the ACM obtained with this method and we presented the results. Conference on Knowledge Discovery and Data Mining According the new results we can conclude that the Self – (KDD), ACM, 2002. removing noise method proved as a good solution for automatically cleaning the images. The SRN Method that we used for filtering the images will be improving in the future in order to obtain results approximate to the results when 92 ARHITEKTURA SISTEMA OpUS Aleš Tavč ar1,2, Jure Šorn1, Tea Tušar1, Tomaž Šef1, Matjaž Gams1,2 Odsek za inteligentne sisteme, Institut »Jožef Stefan« 1 Jamova cesta 39, 1000 Ljubljana, Slovenija Mednarodna podiplomska šola Jožefa Stefana2 Jamova cesta 39, 1000 Ljubljana, Slovenija e-mail: {ales.tavcar, tea.tusar, tomaz.sef, matjaz.gams}@ijs.si POVZETEK Sistem OpUS, ki je bil razvit v okviru projekta e-storitve Obstoje za gospodarstvo, je samostojna in robustna rešitev, ki se či sistemi hišne avtomatizacije oz. pametnih stavb ne omogo lahko vgradi v široko paleto obstoječih sistemov hišne čajo naprednih funkcionalnosti, ki bi jih od takih sistemov pri avtomatizacije. Združuje več inovativnih komponent, kjer čakovali. Trenutne rešitve omogo vsaka skrbi za določen vidik inteligentnega upravljanja čajo zgolj spremljanje stanja sistemov in okolja v hiši ter krmiljenje hišnih naprav preko mobilnih naprav pametne stavbe. in spleta. Definiranje urnikov delovanja je prepuš V nadaljevanju prispevka najprej opravimo kratek čeno samim uporabnikom, kar se običajno odraža v večji pregled sorodnega dela. Nadaljujemo s predstavitvijo porabi in neučinkovitem delovanju. V pričujočem arhitekture sistema OpUS in na kratko opišemo naloge prispevku so predstavljeni načini za nadgradnjo posameznih komponent sistema. sistemov hišne avtomatizacije z inteligentnimi metodami učenja navad uporabnikov in metodami optimizacije 2 SORODNI SISTEMI delovanja. Tak sistem je zmožen spremljanja obnašanja Obširen pregled sodobnih sistemov vodenja v pametnih uporabnikov, se učiti njihovih navad in prilagajati zgradbah je opisal Dounis et.al [1]. Za razliko od klasičnih delovanje glede na spreminjajoče se življenjske navade načinov vodenja, je za optimalno, prediktivno ali adaptivno in potrebe. vodenje potrebno imeti model zgradbe. Temu pristopu sledimo v tem prispevku. 1 UVOD Vodenje sistemov z uporabo podatkov in znanj o Trenutne komercialne rešitve hišne avtomatizacije ne uporabnikih in okolju predstavlja nove smernice raziskav in ponujajo naprednih funkcionalnosti, ki bi jih pričakovali v razvoja tako imenovanega vseprisotnega in prodornega takih sistemih. Namesto pametnega predvidevanja navad in računalništva (ang. 
ubiquitous computing, pervasive potreb uporabnika obstoječi sistemi ponujajo zgolj computing), saj sodobne naprave, senzorji in aktuatorjih, ki krmiljenje hišnih naprav ter nadzor stanja v zgradbi. Slednji se vse bolj množično pojavljajo v zgradbah (senzorji poteka preko različnih pametnih naprav in v redkih primerih prisotnosti, senzorji gibanja, senzorji odprtosti oken, preko spletnih vmesnikov. Nastavljanje režima delovanja in senzorji na mobilnih telefonih, osebne vremenske postaje scenarijev je prepuščeno samim uporabnikom, iz česar itd.) omogočajo beleženje najrazličnejših informacij in običajno sledi, da so urniki nastavljeni površno in zato kopičenje znanj tako o obnašanju posameznega uporabnika, neučinkovito. Poleg tega se navade uporabnikov neprestano kot o obnašanju sistema. Uporaba takšnih znanj se izkorišča spreminjajo. V določenem obdobju lahko npr. uporabniki v sistemih, ki spodbujajo uporabnike k zmanjšanju porabe začnejo prihajati domov kasneje; ob nespremenjenem energije s spodbujanjem k na primer nižanju želenih urniku to pomeni, da se začne ogrevanje hiše prezgodaj, kar temperatur ogrevanja ali pa k izbiri primernih prostorov v se odraža v večji porabi energentov in posledično v višjih službi za potrebe sestankov (manj ljudi - manjši prostor - obratovalnih stroških. Poleg tega se vedno bolj uveljavlja manj energije za ogrevanje) [3]. Znanje o uporabnikih se uporaba fotovoltaike kot komplementaren vir energije. izkorišča za gradnjo modelov uporabnikovega obnašanja in Predpostavljamo, da lahko z inteligentnim kombiniranje uporabo le-teh pri vodenju in adaptaciji sistemov ogrevanja, različnih virov energije zmanjšamo stroške obratovanja in razsvetljave, prezračevanja in ogrevanja sanitarne vode hkrati ohranimo zadovoljivo stopnjo udobja uporabnikov. [4,5]. Prihranki energije se gibljejo med 5-30%. Za reševanje zgornjih nalog predlagamo sistem, ki je Veliko projektov na temo izvedbe testnih pametnih hiš sposoben sprotnega spremljanja dogajanja v hiši, učenja in stanovanj je bilo že dokončanih. Leta 1990 so izdelali navad uporabnikov, prilagajanja delovanja glede na Neural Network House [6], kjer so uporabljali nevronske spremembe v navadah uporabnikov in optimiziranja mreže za inteligentno vodenje sistemov. Sledila sta delovanja celotnega sistema hišne avtomatizacije. IHome[7] in MavHome[8], temelječa na inteligentnem več- agentnem pristopu nadzora in vodenja sistemov z uporabo 93 tehnik za modeliranje in napovedovanje uporabnikovega Zadnja dva tipa agentov sta še podporni in obnašanja in akcij. Gator Tech Smart House [9] je splošno komunikacijski agenti, ki so lastni arhitekturi in skrbijo za uporaben študijski projekt za raziskavo tehnik vseprisotnega prenašanje sporočil med agenti, beleženje posameznih računalništva (ang. pervasive computing). Eden zadnjih agentov, posredovanje podatkov, ontologij ipd. Simbolična projektov - ThinkHome [10] uporablja širok nabor shema arhitekture je prikazana na sliki 2. Sama arhitektura podatkov o okolju, vremenu in uporabniku za namene je zasnovana tako, da je omogočeno enostavno dodajanje in študije vodenja pametnih domov. odstranjevanje agentov v sistem. Vsi sistemi se povečini osredotočajo zgolj na določene vidike upravljanja stavbe. ThinkHome, na primer, poskuša predvidevati temperaturo, ki bo za uporabnika najudobnejša. Poleg tega poskuša predvidevati, kdaj bo nek uporabnik prisoten. 
Sistem OpUS je obširnejši, saj poskuša modelirati večje število parametrov pametne hiše obenem pa v algoritme vodenja vključuje uporabniške navade in optimizacijske algoritme. 3 ARHITEKTURA SISTEMA Arhitektura sistema OpUS je definirana s hierarhično urejenim več-agentnim sistemom [11]. Agenti so avtonomne entitete, ki so sposobne zaznavanja in interakcije z okolico skladno z njihovimi preferencami, Slika 1: Hierarhič na več -agentna arhitektura. lastnostmi in aktivnimi cilji. Sposobni so samostojnega Sistem je logično razdeljen na dva dela, kot je prikazano na razmišljanja in sodelovanja za dosego skupnih ciljev. Vsak sliki 3. Zgornji nivo vsebuje tako imenovano inteligenco, agent v OpUS agentni arhitekturi je določen z agentno torej agente, ki skrbijo za generiranje novih, za uporabnika ovojnico, ki natančno definira posameznega agenta. ustreznejših urnikov. Spodnji nivo (t. i., hrbtenica) vsebuje Ovojnica določi vhodne podatke, ki jih agent zahteva, agente vodenja, ki skrbijo za izvajanje urnikov in sprotno akcije, ki jih lahko izvaja in izhodne podatke, ki jih lahko sinhronizacijo z realnim okoljem. Pomembna predpostavka posreduje. v sistemu je ta, da lahko sistem ob morebitnem izpadu 3.1 Hierarhično urejena agentna arhitektura zgornjega dela še vedno nemoteno deluje in skrbi za upravljanje z napravami v zgradbi. Agentna arhitektura ni samo skupek agentov, ki določa hierarhične odnose med njimi, ampak skrbi za beleženje stanja vsakega agenta, določa način komunikacije med njimi in omogoča podporo za sprotno simuliranje dinamike v prototipnem okolju. Sistem OpUS sestavlja šest različnih tipov agentov. Agent pasivna naprava je naprava, ki zgolj beleži in posreduje določen podatek. Primer takega agenta je senzor temperature zraka. Agent aktivna naprava je naprednejši in omogoča upravljanje z napravo. Primer takega agenta je toplotna črpalka, ki se ga lahko vklaplja, izklaplja in nastavlja na določeno temperaturo. Naslednji tip agentov je program, slednji združuje agente, ki ponujajo različne servise znotraj arhitekture. V to kategorijo spadajo agent dnevnik, ki beleži spremembe v sistemu, agent upravljanja Slika 2: Povezovanje agentov znotraj arhitekture. sistema, agenti za učenje, agenti za optimizacijo, simulacijo ipd. Četrti tip agentov hišnik so upravljavski agenti, ki 3.2 Učenje navad uporabnika omogočajo nadzor in upravljanje določenih sklopov pametne stavbe. Odvisno od razdelitve in same hierarhične V sistem OpUS je vključen modul za učenje navad strukture, lahko ti agenti upravljajo sistem ogrevanja, uporabnika. Slednji poskuša zgraditi model obnašanja določeno sobo ali pa celotno zgradbo. Agent hišnik, ki je v uporabnika na podlagi opazovanja prisotnosti v hiši in v agentni hierarhiji postavljen najvišje, skrbi tako za posameznih sobah. Zgrajeni model lahko za določeno koordinacijo med ostalimi višjenivojskimi agenti, kot tudi obdobje napove verjetnost, da je uporabnik prisoten ali za komunikacijo z zunanjim svetom. Zgradba je namreč odsoten. Napoved prisotnosti uporabnika omogoči celo lahko vključena v pametno mesto, ki lahko od nje zahteva vrsto dodatnih storitev, ki so v sodobnem sistemu hišne aktivno pogajanje o cenah energentov ipd. avtomatizacije nujne. Natančna napoved časa odhoda uporabnika omogoča varčevanje z energijo, saj lahko sistem 94 predčasno izklopi ogrevanje. Podobno lahko natančna večkriterijski optimizacijski algoritem poišče množico napoved časa prihoda omogoči povečanje občutka udobja urnikov, ki predstavljajo kompromisne rešitve glede na pri uporabnikih. 
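The agent envelope described above declares the inputs an agent requires, the actions it can execute and the outputs it can provide. The following is a minimal, hypothetical sketch of such an envelope for a passive device (a temperature sensor) and an active device (a heat pump); it only illustrates the concept and is not the actual OpUS implementation.

```python
# Illustrative sketch only: one possible encoding of an "agent envelope"
# (required inputs, available actions, produced outputs). All names are
# hypothetical and not taken from the OpUS code base.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class AgentEnvelope:
    name: str
    inputs: List[str]                       # data the agent requires
    outputs: List[str]                      # data the agent can provide
    actions: Dict[str, Callable] = field(default_factory=dict)  # commands it accepts

# A passive device only reports a value.
temperature_sensor = AgentEnvelope(
    name="air_temperature_sensor",
    inputs=[],
    outputs=["air_temperature"],
)

# An active device can additionally be controlled.
heat_pump = AgentEnvelope(
    name="heat_pump",
    inputs=["target_temperature"],
    outputs=["power_consumption", "state"],
    actions={
        "turn_on": lambda: print("heat pump on"),
        "turn_off": lambda: print("heat pump off"),
        "set_temperature": lambda t: print(f"set point {t} degrees C"),
    },
)

heat_pump.actions["set_temperature"](21.5)
```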
obravnavana nasprotujoča si kriterija. Dobljene rešitve Gradnja modela poteka tako, da sistem dlje časa beleži predstavljajo približek za t.i. Pareto fronto [12], na osnovi senzorske podatke o prisotnosti in sproti gradi verjetnostni katere nato uporabnik preko uporabniškega vmesnika izbere model navad uporabnika. Prednost sprotne gradnje je v tem, nov, zanj najustreznejši urnik. da omogoča prilagajanje na spremembe v navadah Algoritem gradi množico rešitev iz začetnega urnika, s uporabnika. Sistem nato dnevno uporabi zgrajeni model za preiskovanjem celotnega prostora urnikov in uporabo napoved časa odhoda in prihoda uporabnika. S tem simulatorja za ovrednotenje generirane rešitve iz vidika optimizacijskemu agentu zmanjša preiskovalni prostor, saj porabe in udobja. Tekom izvajanja se generirane rešitve določi intervale, kjer z večjo verjetnostjo pride do izboljšujejo in najboljše rešitve z vidika obeh kriterijev sprememb v režimu delovanja. tvorijo Pareto fronto nedominiranih urnikov. Uporabnik lahko preko uporabniškega vmesnika izbere enega od teh 3.3 Model in simulacija zgradbe urnikov glede na preference o načinu vodenja stavbe. Za Sposobnost natančnega simuliranja realnega okolja izvajanje izbranega urnika poskrbi agentna arhitektura, ki omogoča predhodno verifikacijo strategij vodenja in izbiro sočasno upravlja simulirano in realno prototipno okolje. tiste, ki vodi do najustreznejšega stanja sistema. V sistemu 3.5 Komunikacijski vmesnik OpUS je simulacija ključna za poganjanje optimizacijskih algoritmov in sprotnega izvajanja dinamike sistema. Eden pomembnih modulov v arhitekturi je komunikacijski Razviti simulacijski agent uporablja orodje vmesnik, ki skrbi za sinhronizacijo agentne arhitekture z EnergyPlus[13], ki omogoča simuliranje poljubnih modelov realnim kontejnerskim prototipom. OpUS preko HTTP stavb in različnih naprav. Ključna naloga je zgraditi model zahtevkov pridobiva informacije o stanju naprav stavbe, ki kar se da natančno odraža realno prototipno priključenih na kontrolerje. okolje. Orodje omogoča definiranje naprav s poljubnimi Primer zahtevka za branje temperature v prostoru: specifikacijami, omogoča integracijo fotovoltaike in http://www.exp.com/scgi/?c11025.ts01_temperature akumulatorjev. Za čim bolj natančno izvajanje simulacije _0 uporablja informacije o geografski lokaciji modela, realne informacije o vremenu in sončnem obsevanju. Simulacijski agent omogoča sprotno spreminjanje parametrov delovanja Pri čemer je »c11025« CyBro krmilnik z naslovom 11025, in izpis stanja naprav. Na sliki 3 je prikazana zaslonska ts01_temperature_o pa je ime spremenljivke v krmilniku. maska pregleda porabe posameznih naprav znotraj simulacije. Strežnik vrne odgovor v formatu XML: c11025.ts01_temperature_0 268 Measured internal temperature, multiplied by 10 (e.g. 247 means 24.7°C). Dobljene vrednosti se uporabljajo za sinhronizacijo agentne arhitekture z dogajanjem v realnem okolju, tako da se vrednosti v XML odgovoru prepišejo v ustreznega agenta- napravo. Vodenje sistema poteka na podoben način preko http Slika 3: Prikaz posamezne porabe tekom simulacije. zahtevkov. Na podlagi vrednosti v izbranem urniku pošlje sistem zahtevek za spremembo delovanja naprave. Prednost 3.4 Optimizacija delovanja takega pristopa je decentraliziranost, saj ni potrebno, da sta Sistem OpUS uporablja optimizacijo za izračun novih sistem OpUS in krmilnik locirana na istem strežniku. 
S tem urnikov, ki so za uporabnika ustreznejši in omogočajo tudi ločimo inteligentni del sistema od logičnega dela, ki manjše stroške in večje udobje. Z uporabo simulatorja skrbi zgolj za izvajanje ukazov na krmilniku. stavbe, ki za podani urnik vrača stroške in udobje, 95 review." Renewable and Sustainable Energy Reviews 13.6 (2009): 1246-1261. [2] R. Baños, F. Manzano-Agugliaro, F.G. Montoya, C. Gil, A. Alcayde, J. Gómez, Optimization methods applied to renewable and sustainable energy: A review, Renewable and Sustainable Energy Reviews, Volume 15, Issue 4, May 2011, Pages 1753-1766, ISSN 1364-0321, http://dx.doi.org/10.1016/j.rser.2010.12.008 [3] Laura Klein, Jun-young Kwak, Geoffrey Kavulya, Farrokh Jazizadeh, Burcin Becerik-Gerber, Pradeep Varakantham, Milind Tambe, Coordinating occupant behavior for building energy and comfort management using multi-agent systems, Automation in Construction, Volume 22, March 2012, Pages 525-536, ISSN 0926- 5805, http://dx.doi.org/10.1016/j.autcon.2011.11.012 [4] Lu, Jiakang, et al. "The smart thermostat: using Slika 4: Upravljanje z agenti aktuatorji. occupancy sensors to save energy in homes." Na sliki 4 je prikazan vmesnik za upravljanje z Proceedings of the 8th ACM Conference on Embedded napravami v sistemu OpUS in hkrati, preko Networked Sensor Systems. ACM, 2010. sinhronizacijskega mehanizma, v realnem prototipnem [5] Agarwal, Yuvraj, et al. "Occupancy-driven energy okolju. Na zaslonski maski je prikazana trenutna poraba management for smart building automation." naprav ter celotnega sistema, prikazani so podatki iz Proceedings of the 2nd ACM Workshop on Embedded senzorjev, stanje baterije in trenutna vremenska napoved. Sensing Systems for Energy-Efficiency in Building. Posamezne naprave je mogoče priklapljati in izklapljati ter ACM, 2010. nastavljati parametre delovanja. [6] M. C. Mozer, The neural network house: An environment hat adapts to its inhabitants, Proc AAAI 4 ZAKLJUČEK Spring Symp Intelligent Environments. pp. 110–114, (1998) V pričujočem prispevku smo opisali celotno arhitekturo [7] V. Lesser, M. Atighetchi, B. Benyo, B. Horling, V. L. sistema OpUS in opisali glavne module sistema. Uporaba M. Atighetchi, A. Raja, R. Vincent, P. Xuan, S. X. več-agentne paradigme olajša razvoj kompleksnih sistemov, Zhang, T. Wagner, P. Xuan, and S. X. Zhang. The kjer je pomembna sprotna komunikacija in učinkovito intelligent home testbed. In Proceedings of the sodelovanje med komponentami. Autonomy Control Software Workshop, (1999). Vseh pet glavnih agentov sistema je načrtovanih tako, [8] D. Cook,M. Youngblood, I. Heierman, E.O., K. da lahko uporabljajo storitve drugih agentov in hkrati Gopalratnam, S. Rao, A. Litvin, and F. Khawaja. ponudijo na razpolago svoje funkcionalnosti. Tipičen Mavhome: an agent-based smart home. In Pervasive primer je simulacijski agent, ki uporablja storitve vmesnika Computing and Communications, 2003. (PerCom 2003). ter podatke v agentni arhitekturi in ponuja storitve Proceedings of the First IEEE International Conference simulacije optimizacijskemu agentu ter omogoča sprotno on, pp. 521 – 524 (march, 2003). doi: izvajanje s strani agente arhitekture. 10.1109/PERCOM.2003.1192783. Takšen dinamičen in inteligenten sistem lahko znatno [9] S. Helal, W. Mann, H. El-Zabadani, J. King, Y. zniža porabo in s tem stroške inteligentne hiše. S tem pa Kaddoura, and E. Jansen, The gator tech smart house: a vzpodbuja ekološko ozaveščenost uporabnikov in programmable pervasive space, Computer. 
38(3), 50 – zmanjšuje negativne vplive na okolje. 60 (march, 2005). ISSN 0018-9162. doi: Obstoječi tržni sistemi hišne avtomatizacije 10.1109/MC.2005.107. predstavljenih rešitev ne uporabljajo. Deloma zaradi višjih [10] C. Reinisch, M. J. Kofler, F. Iglesias, and W. Kastner, stroškov razvoja, kar se direktno preslika v ceno takega Thinkhome energy efficiency in future smart homes, sistema, deloma pa zaradi trenutno nezanesljivega in EURASIP J. Embedded Syst. 2011, 1:1–1:18 (Jan., nerobustnega delovanja. Sistem OpUS se uspešno spopada 2011). ISSN 1687-3955. doi: 10.1155/2011/104617 z navedenimi težavami in z uporabniškega stališča omogoča [11] Multi-Agent system: prijazno uporabo in veliko mero avtonomnosti. https://en.wikipedia.org/wiki/Multi-agent_system Reference [12] Kalyanmoy Deb. Multi-Objective Optimization Using Evolutionary Algorithms. Wiley, Nw York, 2001. [1] Dounis, Anastasios I., and Christos Caraiscos. [13] EnergyPlus. "Advanced control systems engineering for energy and http://apps1.eere.energy.gov/buildings/energyplus/energ comfort management in a building environment—A yplus_about.cfm 96 PREDICTIVE PROCESS-BASED MODELING OF AQUATIC ECOSYSTEMS Nina Vidmar1, Nikola Simidjievski2,3, Sašo Džeroski2,3 Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana, Slovenia1 Jožef Sefan Institute, Ljubljana, Slovenia2 Jožef Stefan International Postgraduate School, Ljubljana, Slovenia3 e-mail: nina.vidmar@student.fmf.uni-lj.si, {nikola.simidjievski, saso.dzeroski}@ijs.si ABSTRACT number of aquatic ecosystems, such as: lake ecosystems [4, 5, 6, and 7] and marine ecosystems [3]. However, these In this paper, we consider the task of learning studies focus on obtaining explanatory models of the aquatic interpretable process-based models of dynamic systems. ecosystem, i.e., modeling the measured behavior of the While most case studies have focused on the descriptive system at hand, while modeling future behavior is not aspect of such models, we focus on the predictive aspect. considered. In contrast, Whigham and Recknagel [8] discuss We use multi-year data, considering it as a single the predictive performance of process-based models in a lake consecutive dataset or as several one-year datasets. ecosystem. However, either they assume a single model Additionally, we also investigate the effect of structure and focus on the task of parameter identification, or interpolation of sparse data on the learning process. We explore different model structures where the explanatory evaluate and then compare the considered approaches on aspect of the model is completely ignored. The method the task of predictive modeling of phytoplankton proposed by Bridewell et.al [9] focuses of establishing robust dynamics in Lake Zürich. interpretable process-based models, by tackling the over- fitting problem. Even though this method provides estimates of model error on unseen data, these estimates are not related 1 INTRODUCTION to the predictive performance of the model, i.e., its ability to Mathematical models play an important role in the task of predict future system behavior beyond the time-period describing the structure and predicting the behavior of an captured in training data. Most recently, the study of arbitrary dynamic system. In essence, a model of a dynamic Simidjievski et.al [10] focuses on the predictive performance system consists of two components: a structure and a set of of process-based models by using ensemble methods. parameters. 
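To make the notion of a model as a structure plus a set of parameters concrete, the sketch below simulates a single, deliberately simple ODE structure (logistic growth) for one choice of parameters. The equation is a stand-in chosen for illustration only and is not the phytoplankton model considered later in the paper.

```python
# Toy illustration (not the paper's model): a fixed model *structure*, here a
# single logistic-growth ODE, whose behavior is obtained by simulation once
# the *parameters* (r, K) are given.
import numpy as np
from scipy.integrate import solve_ivp

def structure(t, y, r, K):
    """dy/dt = r * y * (1 - y / K): one candidate structure for a growth process."""
    return r * y * (1.0 - y / K)

params = {"r": 0.3, "K": 10.0}            # one particular parameter calibration
sol = solve_ivp(structure, t_span=(0, 60), y0=[0.5],
                args=(params["r"], params["K"]), dense_output=True)

t = np.linspace(0, 60, 7)
print(np.round(sol.sol(t)[0], 2))         # simulated behavior over time
```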
There are two basic approaches to constructing However, while their proposed ensemble methods improve models of dynamic systems, i.e., theoretical (knowledge- the predictive performance of the process-based models, the driven) modeling and empirical (data-driven) modeling. In resulting ensemble model is not interpretable. the first, the model structure is derived by domain experts of In this paper we tackle the task of establishing an the system at hand, the parameters of which are calibrated interpretable predictive model of a dynamic system. We focus using measured data. In contrast, the later uses measured data on predicting the concentration of phytoplankton biomass in to find the most adequate structure-parameter combination aquatic ecosystems. Due to the high dynamicity and various that best fits the given task of modeling. In both approaches, seasonal exogenous influences [6, 7], most often process- models often take the form of ordinary differential equations based models of such systems are learned using short time- (ODEs), a widely accepted formalism for modeling dynamic periods of observed data (1 year at most). Note however, this systems, allowing the behavior of the system to be simulated short time-periods of data are very sparse, i.e., consist of very over time. few measured values, thus, most often the measurements are Equation discovery [1, 2] is the area of machine learning interpolated and daily samples are obtained from the dealing with developing methods for automated discovery of interpolation. quantitative laws, expressed in the form of equations, from The initial experiments to this end, indicate that the collections of measured data. The state-of-the-art equation predictive performance of such models is poor: While discovery paradigm, referred to as process-based modeling providing dense and accurate description of the observed [3], integrates both theoretical and empirical approaches to behavior, they fail at predicting future system behavior. To modeling dynamics. The result is a process-based model address this limitation we propose learning more robust (PBM) – an accurate and understandable representation of a process-based models. We conjecture that by increasing the dynamic systems. size of the learning data, more general process-base models The process-based modeling paradigm has already been will be obtained, thus yielding better predictive performance proven successful for modeling population dynamics in a while maintaining their interpretability. 97 Figure 1: Automated modeling with ProBMoT. The main contribution of this paper are the approaches to A process-based model consists of two basic types of handling the learning data. The intuitive way of increasing the elements: entities and processes. Entities correspond to the size of the learning data is by sequentially adding state of the system. They incorporate the variables and the predeceasing contiguous datasets, thus creating one long constants related to the components of the modeled system. time-period dataset, i.e., learning from sequential data (LSD). Each variable in the entity has its role. The role specifies In contrast, when learning from parallel data (LPD), the whether the variable is exogenous or endogenous. Exogenous model is learned from all the datasets simultaneously. Figure variables are explanatory/input variables, used as forcing 2 depicts the both approaches. 
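The following schematic sketch contrasts the two ways of handling multi-year data. It only shows how per-dataset discrepancies are assembled into an objective; in ProBMoT the substantive difference is whether the model is simulated once over the contiguous time span (LSD) or separately for each short dataset (LPD). The error function is a placeholder, not an actual model simulation.

```python
# Schematic sketch of the two data-handling strategies (not ProBMoT code).
# `error(params, dataset)` stands in for any discrepancy measure between the
# simulated and the observed behavior on one dataset.

def error(params, dataset):
    # placeholder: "simulate" the model with `params` and compare to `dataset`
    return sum((params["a"] * x - y) ** 2 for x, y in dataset)

def lsd_objective(params, yearly_datasets):
    """Learning from sequential data: the yearly datasets are joined into one
    long contiguous dataset and evaluated as a whole."""
    merged = [point for year in yearly_datasets for point in year]
    return error(params, merged)

def lpd_objective(params, yearly_datasets):
    """Learning from parallel data: each yearly dataset is evaluated
    separately and the discrepancies are combined."""
    return sum(error(params, year) for year in yearly_datasets)

years = [[(1, 2.1), (2, 3.9)], [(1, 1.8), (2, 4.2)]]
print(lsd_objective({"a": 2.0}, years), lpd_objective({"a": 2.0}, years))
```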
The two approaches, in terms terms of the dynamics of the observed system (and are not of learning process-based models, are described in more modeled within the system). Endogenous variables, are the detail in Section 3. response/output (system) variables. They represent the We test the utility of the two approaches on a series of internal state of the system and are the ones being modeled. tasks of modeling phytoplankton concentration in Lake The entities are involved in complex interactions represented Zürich. We use eight yearly datasets, using six for training, by the processes. The processes include specifications of the one for validation and one for testing the predictive entities that interact, how those entities interact (equations), performance of the obtained models. The aim of this paper is and additional sub-processes. two-fold: besides validating the performance of the two From the qualitative perspective, the unity of entity and approaches to handling data when learning predictive processes allows for conceptual interpretation of the modeled process-based models, we also test the quality of the training system. On the other hand, the entities and the processes data. For that purpose, we perform additional set of provide further modeling details that allow for transformation experiments, similar to the previous. However, instead of from conceptual model to equations and therefore simulation using the interpolated data for learning the models – we use of the system, i.e., providing the quantitative abilities of the the original (sparse) measured values, thus examining the process-based model. The equations define the interactions influence of the interpolation on the predictive performance represented by the processes including the variables and of the process-based models. constants from the entities involved. The next section provides more details of the task of The process-based modeling paradigm allows for high- process-based modeling, and introduces a recent contribution level representation of domain-specific modeling knowledge. to the area of automated process-based modeling, i.e., the Such knowledge is embodied in a library of entity and process ProBMoT [4, 10] platform. Section 3 depicts the task of templates, which represent generalized modeling blueprints. predictive process-based modeling of aquatic ecosystems. The entity and process templates are further instantiated in Section 4 describes the data used in the experiments, the specific entities and processes that correspond to the design of the experiments, and the task specification. Section components and the interactions of the modeled system. 5 presents the results of the experiments. Finally, Section 6 These specific model components and interactions define the discusses the findings of this paper and suggests directions set of candidate model structures. for future work. The algorithm for inducing models employs knowledge- based methods to enumerate all candidate structures. For each 2 PROCESS-BASED MODELING AND PROBMOT obtained structure, a parameter estimation is performed using the available training data. For this reason each structure is The process-based modeling paradigm, addresses the task of compiled into a system of differential and algebraic learning process-based models of dynamic systems from two equations, which allows for the model to be simulated. In points of view: qualitative and quantitative. 
The first provides a conceptual representation of the structure of the modeled system. Still, this depiction does not provide enough detail to allow for simulation of the system's behavior. In contrast, the latter treats the process-based model as a set of differential and/or algebraic equations, which allows for simulation. In essence, this includes minimizing the discrepancy between the values of the simulated behavior obtained using the model and the observed behavior of the system. Recent implementations of the PBM approach include Lagramge2.0 [11], HIPM [12] and ProBMoT (Process-Based Modeling Tool) [4, 10], which is described next.

The Process-Based Modeling Tool (ProBMoT) is a software platform for simulating, parameter fitting and inducing process-based models. Figure 1 illustrates the process of automated modeling with ProBMoT. The first input to ProBMoT is a conceptual model of the modeled system. The conceptual model specifies the expected logical structure of the modeled system in terms of the entities and processes that we observe in the system at hand. The second input is the library of domain-specific modeling knowledge. By combining the conceptual model with the library of plausible modeling choices, candidate model structures are obtained.

The model parameters for each structure are estimated using the available training data (the third input to ProBMoT). The parameter optimization method is based on the meta-heuristic optimization framework jMetal 4.5 [13]; in particular, ProBMoT implements the Differential Evolution (DE) [14] optimization algorithm. For the purpose of simulation, each model is transformed into a system of ODEs, which are solved using the CVODE ODE solver from the SUNDIALS suite [15]. Finally, the last input is a separate validation dataset. In both cases (LSD and LPD), the model with the best performance on the validation dataset is the output of the automated modeling process.

3 PREDICTIVE PROCESS-BASED MODELING OF AQUATIC ECOSYSTEMS

ProBMoT has been used extensively to model aquatic ecosystems [4, 5 and 6]. Most of the case studies, however, have focused on descriptive modeling, i.e., on the content and interpretation of the learned models and not on their accuracy and predictive performance (with the exception of [10]). Predominantly, models have been learned from short time-period (one-year) datasets, as considering longer time periods of data resulted in models of poor fit. These models, however, had poor predictive power when applied to new (unseen) data.

We use ProBMoT to learn predictive models of aquatic ecosystems from long time-period (multi-year) datasets. ProBMoT supports predictive modeling, as the obtained models can be applied and evaluated on a testing dataset. Taking the input/exogenous variable values from the test dataset, ProBMoT simulates the model at hand and makes predictions for the values of the output/endogenous (system) variables. Using the output specifications, the values of the output variables of the model are calculated and compared to the output variables from the test set, thus allowing the predictive performance of the model to be assessed.

Concerning the use of long time-period datasets, ProBMoT supports two different approaches, i.e., learning from sequential data (LSD) and learning from parallel data (LPD). The parameter optimization algorithm uses the available training data from the observed system to estimate the numerical values of the parameters. When learning from sequential data, illustrated in Figure 2a, ProBMoT takes as input one training dataset. The training dataset is comprised of several contiguous short time-period datasets, thus the parameters are estimated over the whole time span. On the other hand, when learning from parallel data, depicted in Figure 2b, ProBMoT takes as input several short time-period training datasets. The parameter optimization algorithm handles the short time periods in parallel, i.e., it estimates the optimal model parameters by minimizing the discrepancy between the simulated behavior and each individual training set.

ProBMoT offers a wide range of objective functions for measuring model performance, such as the sum of squared errors (SSE) between the simulated values and the observed data, the mean squared error (MSE), the root mean squared error (RMSE), and the relative root mean squared error (ReRMSE), which is used in all experiments presented here, both when learning the models and when evaluating their performance. The relative root mean squared error (ReRMSE) [16] is defined as

    ReRMSE(m) = sqrt( Σ_{t=0}^{n} (y_t − ŷ_t)² / Σ_{t=0}^{n} (ȳ − ŷ_t)² )    (1)

where n denotes the number of measurements in the test data set, ŷ_t and y_t correspond to the measured and predicted value (obtained by simulating the model m) of the system variable y at time point t, and ȳ denotes the mean value of the system variable y in the test data set.
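Equation (1) translates directly into a few lines of code. The sketch below assumes, as is customary, that ȳ denotes the mean of the measured values in the test set; the function name and array-based interface are ours, not ProBMoT's.

```python
# A direct transcription of Equation (1), assuming y_bar is the mean of the
# measured values in the test set.
import numpy as np

def rermse(measured, predicted):
    measured = np.asarray(measured, dtype=float)    # \hat{y}_t in the paper's notation
    predicted = np.asarray(predicted, dtype=float)  # y_t, obtained by simulating model m
    num = np.sum((predicted - measured) ** 2)
    den = np.sum((measured.mean() - measured) ** 2)
    return float(np.sqrt(num / den))

# ReRMSE = 1 corresponds to a model no better than predicting the mean;
# values below 1 indicate an improvement over that baseline.
print(rermse([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```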
The data on the aquatic systems are very sparse (e.g., measured on a monthly basis). In the above-mentioned studies, they have typically been interpolated and sampled at a daily interval. Here, to assess the effect of the interpolation on the performance of the models, we also consider using only the original measured values when establishing the predictive process-based model.

Figure 2: Two approaches to predictive modeling. a) Learning from sequential data (LSD), and b) Learning from parallel data (LPD).

4 EXPERIMENTAL SETUP

In this study, we apply the automated modeling tool ProBMoT to the task of predictive modeling of phytoplankton dynamics in Lake Zürich, Switzerland. We empirically evaluate the two different approaches for learning predictive models, LSD and LPD, on this task. We apply those two on interpolated data (sampled daily) and on the original (sparse) data.

4.1 Data & domain knowledge

The datasets used for our experiments were obtained from the Water Supply Authority of Zürich. Lake Zürich is a lake in Switzerland, extending southeast of the city of Zürich. It has an average depth of 49 m, a volume of 3.9 km3 and a surface area of 88.66 km2. The measurements consist of physical, chemical and biological data for the period from 1992 to 1999, taken once a month at 19 different sites, and averaged to the respective epilimnion (upper ten meters) and hypolimnion (bottom ten meters) depths. The data were interpolated with a cubic spline algorithm and daily samples were taken from the interpolation [17].
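The interpolation step described above can be reproduced, for instance, with a cubic spline fitted to the monthly measurements and evaluated at a daily interval; the numbers below are invented for illustration.

```python
# Sketch of the preprocessing described above: monthly measurements are
# interpolated with a cubic spline and sampled once per day.
import numpy as np
from scipy.interpolate import CubicSpline

day_of_year = np.array([15, 46, 74, 105, 135, 166, 196, 227, 258, 288, 319, 349])
chlorophyll = np.array([1.2, 1.0, 2.5, 6.3, 9.8, 7.1, 5.4, 4.9, 3.8, 2.6, 1.9, 1.4])

spline = CubicSpline(day_of_year, chlorophyll)
daily_days = np.arange(day_of_year[0], day_of_year[-1] + 1)   # one sample per day
daily_values = spline(daily_days)

print(len(daily_values), daily_values[:5].round(2))
```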
Both the original and the interpolated data from the first six years (1992-1997) were used for training the models, data from year 1998 for validation, and data from 1999 to estimate the predictive performance of the learned models.

The population dynamics model considered consists of one endogenous/output (system) variable and multiple exogenous/input variables structured within a single ODE. The phytoplankton biomass is represented as the system variable, while the exogenous variables include: the concentration of zooplankton, dissolved inorganic nutrients (nitrogen, phosphorus, and silica) and two environmental influences, water temperature and global solar radiation (light).

The library for process-based modeling of aquatic ecosystems used in our experiments is the one presented by Atanasova [18]. In particular, to reduce the computational complexity of our experiments, we use a simplified version of the library, which results in a total of 128 candidate models.

4.2 ProBMoT parameter settings

For the parameter calibration procedure we use Differential Evolution with the rand/1/bin strategy and 1000 evaluations over a population of 50 individuals. For simulating the ODEs we use the CVODE simulator with absolute and relative tolerances set to 10^-3. For measuring the model performance we use the objective function ReRMSE, described in Section 3. To further assess the significance of the differences in performance between the single-dataset approach and the multiple-datasets approach we use the Wilcoxon test for statistical significance [19], as presented by Demšar [20].

4.3 Experimental design

In this paper we compare the performance of the two different approaches (LSD and LPD) to learning predictive process-based models. For each approach we learn six process-based models using the available training data of six successive years (1992-1997). In all of the experiments, the validation dataset (year 1998) and the test dataset (year 1999) remain the same. For both cases, we start with one short time-period training dataset.

5 RESULTS

Table 1 summarizes the performance comparison between models learned from sequential data (LSD) and models learned from parallel data (LPD), using both interpolated (left-hand side) and original (right-hand side) training data. Note that, in both cases, learning from sequential data yields better predictive performance than learning from parallel data. The Wilcoxon test comparing the models learned from sequential data (LSD) and models learned from parallel data (LPD) shows that using LSD is better than using LPD; however, the difference in performance is not substantial nor significant (p-value = 0.11).

Table 1: Comparison of the predictive performances (ReRMSE on test data) of models learned from sequential data (LSD) and models learned from parallel data (LPD), from both interpolated and original samples. The numbers in bold indicate the best result for the given years.

Train data      Interpolated          Original
(years)         LSD      LPD          LSD      LPD
'97             1.398    1.398        1.074    1.074
'96-'97         1.099    1.391        1.381    1.469
'95-'97         1.006    1.044        0.984    1.084
'94-'97         0.986    1.094        1.004    1.112
'93-'97         1.075    1.109        1.105    1.085
'92-'97         0.934    0.998        1.074    0.974
Wilcoxon test   LSD > LPD;            LSD > LPD;
                p-value = 0.11        p-value = 0.11

Next, as shown in Table 1, using the original measured values when learning the models did not improve their predictive performance.

Finally, and most importantly, from both experiments performed we can conclude that using large amounts of training data (even interpolated) improves the overall predictive performance of the learned process-based models. Note, however, that for one case ('93-'97) the performance of the models does not improve. Further investigations are required to determine whether this phenomenon is due to the quality of the data of that particular dataset ('93), or to the dynamics of the system at that particular period significantly differing from the rest.

6 CONCLUSION

In this paper, we tackle the task of learning predictive interpretable process-based models of dynamic systems.
In time-period training dataset (year 1997), and continue for five the process of establishing general and robust predictive steps adding one preceding year to the training data set. At models, we investigate learning from parallel data (LPD), in each step we learn the process-based models accordingly to contrast to the state-of-the-art approach of learning from the two approaches described in the previous section. sequential data (LSD). We apply the both approaches to the First, we apply this two approaches on the interpolated task of modeling phytoplankton dynamics in Lake Zürich, data, or more precisely, daily samples of interpolated data. using ProBMoT, a platform for simulating, parameter fitting Second, we apply the two learning approaches to the original and inducing process-based models. Additionally, besides (sparse) training data. In all of the experiments the validation validating the performance of the approaches to learning 100 predictive process-based models, we also test the quality of References: the training data by learning models from the original [1] P. W. Langley, H. A. Simon, G. Bradshaw, J. M. Zytkow. Scientific measured values, in contrast to learning models from daily Discovery: Computational Explorations of the Creative Processes. samples of interpolated data. MA: The MIT Press, Cambridge, MA, USA. 1987. The general conclusion of this paper is that using larger [2] S. Džeroski, L. Todorovski. Learning population dynamics models amounts of training data for learning process-based models from data and domain knowledge. Ecological Modelling 170, pp. 129– 140. 2003. yields improved predictive performance for tasks of modeling [3] W. Bridewell, P. W. Langley, L. Todorovski, S. Džeroski. Inductive aquatic ecosystems. Both, Atanasova et al [5] and Taškova et process modeling. Machine Learning 71, pp. 1–32. 2008. al. [6] clearly state that one-year datasets produce models [4] D. Čerepnalkoski, K. Taškova, L. Todorovski, N. Atanasova, S. Džeroski. The influence of parameter fitting methods on model with poor predictive performance. We show that using data structure selection in automated modeling of aquatic ecosystems. from a longer period, considered either consequently (LSD) Ecological Modelling 245 (0), pp. 136–165. 2012. or parallel (LPD) helps in deriving more general models, and [5] K. Taškova, J. Šilc, N. Atanasova, S. Džeroski. Parameter estimation therefore, better predictive models. in a nonlinear dynamic model of an aquatic ecosystem with meta- heuristic optimization. Ecological Modelling 226, pp. 36–61. 2012. Even though the statistical significance comparison shows [6] N. Atanasova, F. Recknagel, L. Todorovski, S. Džeroski, B. Kompare. that the LSD approach has better performance than the LPD Computational assemblage of Ordinary Differential Equations for approach, the difference in performance is neither substantial Chlorophyll-a using a lake process equation library and measured data nor significant. Nevertheless, when learning from sequential of Lake Kasumigaura. In: Recknagel, F.(Ed.), Ecological Informatics. data, due to the mater of simulation and parameter Springer, pp. 409–427. 2006a. [7] N. Atanasova, L. Todorovski, S. Džeroski, R. Remec, F. Recknagel, optimization, the available training data considered for B. Kompare. Automated modelling of a food web in Lake Bled using learning process-based models should be contiguous. On the measured data and a library of domain knowledge. 
Ecological other hand, one useful feature of the LPD approach is that can Modelling 194 (1-3), pp. 37–48. 2006c. [8] P. Whigham, F. Recknagel, F. Predicting Chlorophyll-a in freshwater handle missing data (e.g. intermediate period with no lakes by hybridising process-based models and genetic algorithms. measurements) for establishing robust process-based models. Ecological Modelling 146 (13), pp. 243–251. 2001. Our empirical evaluation of learning from the original [9] W. Bridewell, N. B. Asadi, P. Langley, L. Todorovski. Reducing uninterpolated and sampled interpolated data, showed that the overfitting in process model induction. In: Proceedings of the 22nd International Conference on Machine learning. (ICML ’05). ACM, pp. interpolation does not affect the performance of the learned 81–88. 2005. process-based models. On the contrary, the models learned [10] N. Simidjievski, L. Todorovski, S. Džeroski. Learning ensembles of using the interpolated values yielded better performance than population dynamics models and their application to modelling aquatic the ones learned using the original values. We conjecture that ecosystems. Ecological Modelling (In Press). 2014. this is due to the sparsity of the original measured values (~ [11] L. Todorovski, S. Džeroski. Integrating domain knowledge in equation discovery. In: Džeroski, S., Todorovski, L. (Eds.), Computational 12 time-points per year), which is insufficient to capture the Discovery of Scientific Knowledge. Vol. 4660 of Lecture Notes in dynamics of such a system. Moreover, considering the Computer Science. Springer Berlin, pp. 69–97. 2007. relative performance between the two approaches, the LSD [12] L. Todorovski, W. Bridewell, O. Shiran, P. W. Langley. Inducing hierarchical process models in dynamic domains. In: Proceedings of approach performed insignificantly better than the LPD the 20th National Conference on Artificial Intelligence. AAAI Press, approach Pittsburgh, USA, pp. 892–897. 2005. Taken all together, some new questions arise for further [13] J. J. Durillo, A. J. Nebro. jMetal: A Java framework for multi-objective investigation. How strongly the quality of measurements optimization. Advances in Engineering Software 42, pp. 760–771. 2011. affects the results? Would the results change significantly in [14] R. Storn, K. Price. Differential Evolution – A simple and efficient the case of ideal measurements? Considering this, possible heuristic for global optimization over continuous spaces. Journal of directions for further work are as follows. First, performing Global Optimization 11 (4), pp. 341–359. 1997. more experiments using multiple parallel sets of data from [15] S. D. Cohen, A. C. Hindmarsh. CVODE, a stiff/nonstiff ODE solver in different periods and, data from various different lake C. Computers in Physics 10 (2), pp. 138–143. Mar. 1996. [16] L. Breiman. Classification and Regression Trees. Chapman & Hall, ecosystems should be used. In order to achieve more London, UK. 1984. controlled experiments, we consider testing the presented [17] A. Dietzel, J. Mieleitner, S. Kardaetz, P. Reichert. Effects of changes approaches on synthetic data, that is, data obtained by in the driving forces on water quality and plankton dynamics in three simulating a well-established model of an arbitrary aquatic swiss lakes long-term simulations with BELAMO. Freshwater Biology 58 (1), pp. 10–35. 2013. ecosystem. Finally, we would like to extend our approach to [18] N. Atanasova., L. Todorovski, S. Džeroski, B. Kompare. 
Constructing different ecosystems and other domains. a library of domain knowledge for automated modelling of aquatic ecosystems. Ecological Modelling 194 (13), pp. 14–36. 2006b. [19] F. Wilcoxon. Individual comparisons by ranking methods. Biometrics, Acknowledgements 1:80–83. 1945. We would also like to acknowledge the support of the [20] J. Demšar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7, pp. 1–30. Dec. 2006. European Commission through the project MAESTRA - Learning from Massive, Incompletely annotated, and Structured Data (grant number ICT-2013-612944). 101 RECOGNITION OF BUMBLEBEE SPECIES BY THEIR BUZZING SOUND Mukhiddin Yusupov1, Mitja Luštrek2, Janez Grad, Matjaž Gams2 Czech Technical University in Prague – Computer Science Department1 Jozef Stefan Institute – Intelligent Systems Department2 e-mail: yusupmuk@fel.cvut.cz, {mitja.lustrek, matjaz.gams}@ijs.si, janez.grad@siol.com ABSTRACT Although there are generalizations of the type of problem we are solving here, such as a system for classifying many The goal of our work is to help people to automatically types of insects [4], relatively little has been done previously identify the species and worker/queen type of on automatic recognition on bumblebee species with a bumblebee based on their recorded buzzing. Many detailed analysis of their buzzing sound. Several internet recent studies of insect and bird classification based on applications provide sounds and images and images of their sound have been published, but there is no different species of birds, frogs etc., but they rely on human thorough study that deals with the complex nature of pattern-recognition skills to identify the species at hand and buzzing sound characteristic of bumblebees. In this do not provide active help. Our study is related to active paper, a database of recorded buzzings of eleven species system help in recognizing a particular (sub)species, and in were preprocessed and segmented into a series of sound particular to other audio data classification problems like samples. Then we applied J48, MLP and SVM classification of general audio content [8], auditory scene supervised classification algorithms on some recognition, music genre classification and also to the predetermined sets of feature vectors. For five species speech recognition, which have been studied relatively the recognition rate was above 80% and for other six extensively during last few years also in our department. We species it was above 60%. At the end we consider how to here try to take advantage from these previous studies. further improve the results. Some studies like [7], where they also tried to classify bee 1 INTRODUCTION species, used different approaches. We can view these systems as solving a pattern recognition problem. In [7] the Bumblebees are important pollinators of many plants and recognition of bee species is performed visually, based on their colonies are now used extensively in greenhouse its image. The task was to find relevant patterns from the pollination of crops such as tomatoes and strawberries. image and identify similarities to specific species. However, Some bumblebee species are declining and it is a cause for pictures vary a lot based on different factors, and often a concern in Europe, North America and Asia. In Europe, picture does not represent well what we see in nature. 
In our around one quarter of species are threatened with extinction, work the patterns are buzzing sound events produced by according to recent studies. This is due to a number of bumblebees. The chosen approach is recognition based on a factors, including land clearing and agricultural practices. parametric (feature) representation of the sound events. There is a lot of research devoted to keep some bumblebee Features should be selected so that they are able to species from such decline. maximally distinguish sounds that are produced by different bumblebee species. Most of the recognition systems based Until now over 250 species are known. There are about 19 on audio and especially human voice recognition uses Mel- different species of bumblebee found in the UK, 68 in frequency cepstrum coefficients (MFCC) as a feature vector. Europe, 124 species in China, 24 in South America and 35 There are also works where feature vectors are Linear in Slovenia. The colonies of bumblebees are composed of a Predictive Coding coefficients (LPC) or a set of low-level queen and many workers. Since only experts can identify signal parameters like in [1]. the species by looking at or listening to them and their sound, we decided to make this identification easy for all. This paper uses MFCC and LPC to extract the features. For One needs to record the buzzing and provide it to the system the extraction of features and for other processing of audio (program) that will process it and then tell to which species records we used jAudio package [9]. Before feature and worker/queen type this buzz corresponds to. The extraction we preprocess and segment the sound recordings. program is accessible from the homepage of the Department We found that the segmentation is as important as the of intelligent systems at the Jozef Stefan Institute - extraction of features with a strong influence on the http://dis.ijs.si/. More information can be provided from prediction accuracy. Then we constructed the model janez.grad@siol.com. separately with three different classification algorithms: J48, MLP and SVM. Training and evaluation of a model were 102 performed on a stored database of fifteen species of However, spectral changes of signal parts are rather diverse bumblebees. The experiments were carried out using and detection of boundaries of such samples is difficult WEKA open source machine learning software. Results because adjacent samples of separate buzzings can overlap show that SVM has better performance than other two in time and frequency. Moreover, due to the starting point of systems. buzz is being slow it may occur below the background noise level. In Figure 1 we can see the representation of the sound 2 PREPROCESSING record of humilis queen. It is difficult to recognize there are three separate relevant parts and everything in between with Each sound record preprocessing consists of three steps: low frequency components as not relevant. normalization, pre-emphasis and segmentation. First the normalization is applied to the record by dividing it with In Figure 2 it is even more problematic to say where exactly maximum value: buzzing of the sylvarum worker starts and only in 20% of ~ the recording there is the buzzing signal we are interested in. 
x̃(i) = x(i) / max x(i),  0 ≤ i ≤ n−1,   (1)

where x(i) is the original signal, x̃(i) is the normalized signal and n is the length of the signal.

After that, pre-emphasis is performed in order to boost the high-frequency components, while leaving the low-frequency components in their original state. This approach is based on the observation that the sound data comes with high frequency and low magnitude, whereas the parts of the recording that we are not interested in (noise, gaps) incorporate low frequency and much higher energy. The pre-emphasis factor α is computed as

α = e^(−2πFt),   (2)

where t is the sampling period of the sound. The new sound is then computed by applying the filter

H(z) = 1 − αz^(−1).   (3)

The last step of the pre-processing is segmentation. In this step we separate the sound record into a number of samples which represent only the buzzing. Each sound record is 45 to 60 seconds long. Extracting features from the whole sound record, firstly, increases the computational complexity and, secondly, affects the accuracy of the recognition. We do not need to calculate the features for the silent, noisy and other irrelevant parts of the record.

During the investigation of the spectrum of each bumblebee species we also found out that the buzzing of the same species can vary based on the state of the bumblebee, or that during the buzzing of one species some other ones can join in, and as a result we get a combination of buzzes. The same species makes only one buzz when, for example, it is working, and has some other, different buzz when it is angry.

We have to take into account various factors in devising a segmentation method, since unsuccessful separation of a record would result in unsuitable candidate samples, and subsequently the parametric representation would differ from that of real signal data. That is why, for the current version of our work, we decided to segment the audio recordings manually with an audio editor program, so that we could see the result of recognition based purely on real signal data. On one hand, this decision of manual separation obliges us to use in the testing phase of the model only noiseless records where most parts of the record consist of signal data. But on the other hand we analyzed how the recognition accuracy changes when we change the segmentation strategy, since by visually looking at the spectrum of a record it is easier to segment it. In the current state of the work we segmented the recordings manually into samples of 1-4 seconds in length, and the parts with less than 1 second of buzzing duration were combined with adjacent samples.

Figure 1: Representation of the audio record of the humilis queen species in the time domain.

Figure 2: Representation of the audio record of the sylvarum worker species in the time domain.

3 FEATURE EXTRACTION AND MODEL CONSTRUCTION

For each sample segment we calculated MFCC and LPC features. These are the features most commonly used in audio-based classification tasks. Samples are processed in a window-by-window manner. The size and the overlapping factor of the windows are the two key parameters of feature extraction in any audio/signal processing task. We found that a window size of 2014 and an overlapping factor of 30% give us the feature vectors which subsequently resulted in the best recognition model. For each window we have several MFCC or LPC values. It is better to represent each window with one feature value by aggregating all the values in a window, so we applied the aggregation by computing the mean value for each window.
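The preprocessing chain of Equations (1)-(3) and the window-by-window aggregation can be sketched as follows. The actual MFCC and LPC features were extracted with jAudio; here a simple per-window statistic stands in for them, and the constants (pre-emphasis factor, window size, overlap) are illustrative rather than the values used in the experiments.

```python
# Sketch of the preprocessing and windowing steps described above. The mean
# of the raw samples in each window stands in for the per-window aggregation
# of MFCC/LPC values; all constants are illustrative.
import numpy as np

def preprocess(x, alpha=0.97):
    x = np.asarray(x, dtype=float)
    x_norm = x / np.max(np.abs(x))                  # normalization, cf. Eq. (1)
    # pre-emphasis filter H(z) = 1 - alpha*z^-1, i.e. y[i] = x[i] - alpha*x[i-1]
    return np.append(x_norm[0], x_norm[1:] - alpha * x_norm[:-1])

def windowed_features(signal, window=1024, overlap=0.3):
    step = int(window * (1 - overlap))              # overlapping windows
    feats = []
    for start in range(0, len(signal) - window + 1, step):
        frame = signal[start:start + window]
        feats.append(frame.mean())                  # one aggregated value per window
    return np.array(feats)

rng = np.random.default_rng(0)
buzz = np.sin(np.linspace(0, 200 * np.pi, 8000)) + 0.1 * rng.standard_normal(8000)
print(windowed_features(preprocess(buzz)).shape)
```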
In this work we considered three classification algorithms for building the model: the J48 decision tree, the MLP neural network and the SVM algorithm. Models were built by these algorithms to classify among the 15 different bumblebee cases most common in Slovenia, i.e. central Europe. Classifying between a worker and a queen is not difficult and on average 90% of all species are easily identified, but we are more interested in knowing the exact type of species, like hortorum, hypnorum and pratorum, in addition to the status of a bumblebee in the colony. So for each species we have two cases, either queen or worker, altogether 15 classes. The fact that the number of records for each class of species in our testing and training dataset is not evenly distributed caused slight inconvenience in building a good model. Also, this is one of the reasons why we have different rates of accuracy across species. Five of the bumblebee species we recognized with above 80% accuracy, and two of them had a rate of 95%. In Table 1 we provide the rates of recognition for each model, built separately on MFCC and LPC feature values with the three algorithms.

Table 1: Evaluation of the rates of recognition accuracy for each built model

        LPC    MFCC
J48     56%    56%
MLP     56%    60%
SVM     57%    64%

In practical terms, when the system proposed the three most probable classes, the accuracy rose to over 90% overall, enabling users to distinguish between the three proposed potential solutions visually. This is the way the system works at the moment.

4 CONCLUSION

In the future we want to make the segmentation step separate the records into samples automatically, by incorporating all we learned from the recordings and patterns of the 11 bumblebee species of both types. Also, we are going to build models using HMM and deep learning, because in many works related to audio classification HMM and deep learning produce the best results. Then we intend to compare their results with the ones we obtained from SVM.

References:
[1] Seppo Fagerlund. Bird Species Recognition Using Support Vector Machines. EURASIP Journal on Advances in Signal Processing, Volume 2007, Article ID 38637. doi:10.1155/2007/38637
[2] Zhu Leqing, Zhang Zhen. Insect sound recognition based on SBC and HMM. 2010 International Conference on Intelligent Computation Technology and Automation.
[3] Forrest Briggs, Raviv Raich, and Xiaoli Z. Fern. Audio Classification of Bird Species: a Statistical Manifold Approach.
[4] Diego F. Silva, Vinícius M. A. de Souza, Eamonn Keogh, Daniel P. W. Ellis. Applying Machine Learning and Audio Analysis Techniques to Insect Recognition in Intelligent Traps.
[5] Ruben Gonzalez. Better than MFCC Audio Classification Features.
[6] J. F. Bueno, T. M. Francoy, V. L. Imperatriz-Fonseca, A. M. Saraiva. Modeling an automated system to identify and classify stingless bees using the wing morphometry - A pattern recognition approach.
[7] H. Fujisaki, S. Ohno. Analysis and Modeling of Fundamental Frequency Contour of English Utterances. Proc. EUROSPEECH'95, Vol. 2, pp. 634-637, Philadelphia, 1996.
[8] Erling Wold, Thom Blum, Douglas Keislar and James Wheaton. Content Based Classification, Search, and Retrieval of Audio.
[9] Daniel McEnnis, Cory McKay, Ichiro Fujinaga, Philippe Depalle. jAudio: A feature extraction library.
104 THE LAS VEGAS METHOD OF PARALLELIZATION Bogdan Zavalnij Institute of Mathematics and Informatics University of Pecs Ifjusag utja 6, H-7634 Pecs, Hungary e-mail: bogdan@ttk.pte.hu ABSTRACT 2 THE FAMILY OF MONTE CARLO METHODS The family of the Monte Carlo methods can be While the methods of parallelizing Monte Carlo algorithms in engineering modeling very popular, these distinguished by the nature of their error. We can speak methods are of little use in discrete optimization about two sided, one sided and zero sided error methods. In problems. We propose that the variance of the Monte the case of two sided errors we approximate the solution Carlo method, the Las Vegas method can be used for step by step. In the analysis of the method we can measure these problems. We would like to outline the basic the distance of the approximation from the real solution, concept and present the algorithm working on a specific and find that we can get closer and closer in each step. This problem of finding the maximum clique. method is mostly useful in engineering modeling of problems with real number solutions. This two sided 1 INTRODUCTION method can be programmed in parallel environment with The Monte Carlo methods have been powerful tools in ease, as the steps usually independent. scientific and engineering modeling for the last half century In the case of one sided error, which takes place mostly in [1]. Their usage become even more intense in the era of decision problems, in each step we either get a final computers. The easiness of parallelization made these solution, or get no answer. This is the idea behind many methods useful in supercomputer environments as well. But primality tests, where we can find the composite numbers, apart from the original idea the more recent versions of but get uncertain answer for primes. The algorithms make a these methods, in which category the Las Vegas method is few dozen steps, and the uncertainty of being wrong falling [2], found little usage in algorithms, and even less in decreases to minimum. These algorithms usually are very parallel programs. The few exceptions are the primality fast and need no parallelization. tests and the quicksort algorithm, although some research The last method, which is called the Las Vegas method, is was made earlier in this field [7][8]. the case of zero sided error. The famous quicksort algorithm The problem we concentrate on is the maximum clique falls into this category. With these algorithms we always get problem [4], although the concept described in this paper the right answer – as the quicksort sorts the sequence in the applies to other problems in the field of discrete end –, but the running time of the algorithm can be optimization as well. The maximum clique problem can be described by a probability variable. In other words formulated in the following way. Given a finite simple sometimes the algorithm is very fast, and sometimes it can graph G=(V,E), where V represents the nodes and E be very slow. (Luckily the later case is very-very rare in the represents the edges. We call Δ a clique of G if the set of case of the quicksort.) Formally, we call an algorithm a Las Vegas algorithm if for vertices of Δ is subset of V; Δ is an induced subgraph of G a given problem instance the algorithm terminates returning ; and Δ is an all connected graph, thus all its nodes a solution, and this solution is guaranteed to be a correct connected to all the other nodes. 
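As a small illustration of the definitions above, the following brute-force sketch checks whether a vertex set forms a clique and searches for a k-clique by enumeration; it is exponential and only meant to make the k-clique question concrete, not to stand in for the algorithms discussed later in the paper.

```python
# Brute-force illustration of the clique and k-clique notions defined above.
# Practical only for tiny graphs: the k-clique problem is NP-complete.
from itertools import combinations

def is_clique(vertices, edge_set):
    return all((u, v) in edge_set or (v, u) in edge_set
               for u, v in combinations(vertices, 2))

def find_k_clique(n, edges, k):
    """Return a k-clique of the graph with vertices 0..n-1, or None."""
    edge_set = set(edges)
    for candidate in combinations(range(n), k):
        if is_clique(candidate, edge_set):
            return candidate
    return None

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
print(find_k_clique(4, edges, 3))   # (0, 1, 2)
print(find_k_clique(4, edges, 4))   # None
```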
We call Δ a maximum solution; and for any given problem instance, the run-time clique if no other clique of G is bigger than it. The of the algorithm applied to this problem is a random maximum clique problem is to find the size of a maximum variable. [13] clique, and it is a well known NP-hard problem. A From this description it is clear, that the Monte Carlo simplified variation of this problem is the k- clique problem, method, on one hand, can be easily used for engineering which is a problem of the NP-complete class. The question problems as we are looking for real number answers with is, that if given a graph G, and a positive integer k, is there a certain correctness. On the other hand, in the case of clique of size k in the graph. To answer the question we discrete and combinatorial optimization problems we either must present a k-clique of the graph, or either prove usually need exact answers, so the Las Vegas method can that there is none in the graph. prove itself of more use. 105 3 PARALLELIZATION WITH THE AIM OF THE can help other instances to solve their subproblems faster. LAS VEGAS METHOD This method resembles the BlackBoard technique known The variance in the running time of a Las Vegas algorithm well in the field of Artificial Intelligence. led Truchet, Richoux and Codognet to implement an This approach can be used to parallelize several different interesting way of parallelization the algorithms for some discrete optimization algorithms. Namely, we can use it in NP-complete discrete optimization problems [13]. The any Branch-and-Bound technique instead of the branching authors note that the algorithm implementation for those rule. As it happens at a branching we have the problem of problems heavily depends on the "starting point" of the choosing the sequence of the branches. The speed of the algorithm, as it starts from a random incorrect solution and algorithm heavily depends on this sequence, as the result in constantly changes it to find a real solution. Depending on one branch may help us in an other branch – as a new, the starting incorrect solution the convergence of the better bound for example. algorithm may be very fast or slow as well. The idea behind the Las Vegas parallel algorithm was to start several 4 AN APPLICATION instances of the sequential algorithm from different starting In order to demonstrate the described method we choose a points and let them run independently. The first instance more simple algorithm than a general Branch-and-Bound. which finds the solution shuts down all the other instances Instead we used an algorithm from Sandor Szabo [10], and the parallel algorithm terminates. As the running time which answers the k-clique problem by dividing the original of the different instances vary, some will terminate faster, problem into thousands of subproblems. These subproblems thus ending the procedure in shorter time. The article then can be processed parallely with a sequential program. describes the connection of the variance of the running Obviously this algorithm needs proper number of times and the possible speed-up when using k instances and subproblems in order to achieve proper speed-ups, which found that for some problems a linear speedup could be this algorithms achieves well. The proposal starts with a achieved. quasi coloring with k-1 colors, and then examines each This approach can be useful in several ways. 
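The basic Las Vegas parallelization discussed in the next section can be sketched as a race between independent instances started from different random seeds, where the first instance to finish terminates the others. The solver below is a stand-in with a random running time, not an actual clique search.

```python
# Sketch of the "race" form of Las Vegas parallelization: several instances of
# a randomized solver run independently and the first finisher wins.
import multiprocessing as mp
import random
import time

def solver(seed, result_queue):
    rng = random.Random(seed)
    time.sleep(rng.uniform(0.1, 2.0))        # running time is a random variable
    result_queue.put((seed, "solution"))     # the answer is always correct

def las_vegas_race(n_instances=8):
    queue = mp.Queue()
    workers = [mp.Process(target=solver, args=(seed, queue))
               for seed in range(n_instances)]
    for w in workers:
        w.start()
    seed, answer = queue.get()                # first instance to finish wins
    for w in workers:                         # shut down the remaining instances
        w.terminate()
        w.join()
    return seed, answer

if __name__ == "__main__":
    print(las_vegas_race())
```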
While the previous approach is simple and elegant, it lacks something. First, the different instances cannot help each other in finding the solution. Second, each instance tries to solve the whole problem, and no division into subproblems appears in this proposal.
I propose a different approach which includes these notions. If we divide an NP-hard problem into parts, the arising subproblems fall into the same category as described above: they are also NP-hard problems and have great variance in solution time. But we then face the problem of constructing the sequence of the divided subproblems. As the solution of one subproblem can be helpful in the solution of another, the sequence of these subproblems is of great importance: we would like to solve the easier ones first to help the more complex ones later. Here we can use some heuristics, but more often we proceed in the order in which the subproblems are already given, which leads to an inefficient algorithm.
Instead we can use the proposed Las Vegas technique, starting the instances of the solver for the arising subproblems in parallel. Thus we may overcome the question of sequence construction. As we have seen, some problems will run much faster and terminate with the desired answer. These answers can then be fed to the other instances and help them solve their subproblems faster. This way each instance, solving a partial problem instead of the whole, can help other instances to solve their subproblems faster. This method resembles the BlackBoard technique well known in the field of Artificial Intelligence.
This approach can be used to parallelize several different discrete optimization algorithms. Namely, we can use it in any Branch-and-Bound technique instead of the branching rule. At a branching we have the problem of choosing the sequence of the branches. The speed of the algorithm depends heavily on this sequence, as the result in one branch may help us in another branch – as a new, better bound, for example.

4 AN APPLICATION
In order to demonstrate the described method we chose a simpler algorithm than a general Branch-and-Bound. We used an algorithm from Sandor Szabo [10], which answers the k-clique problem by dividing the original problem into thousands of subproblems. These subproblems can then be processed in parallel with a sequential program. Obviously this approach needs a proper number of subproblems in order to achieve proper speed-ups, which this algorithm achieves well. The proposal starts with a quasi coloring with k-1 colors, and then examines each disturbing edge, asking whether that edge can be an edge of a k-clique. If yes, then we have found a positive solution; if no, then the edge can be deleted from the graph. After all the disturbing edges are deleted, we get a proper k-1 coloring, which forbids a k-clique, and thus we have solved the problem. I have implemented this proposal and measured the running times for several different problems [14].
The measurements compare three versions of the algorithm. In the first, no information is passed from one subproblem to another to help in the solution; the program instances run totally independently. In the second, I constructed a sequence where the helping information is the consequence of this sequential ordering, so the help is given in advance. This means that we can delete the edges in the sequence of the subproblems in advance, asserting that no k-clique can contain them. There is no actual communication between the program instances and they also run independently. These two versions are detailed in the paper of Sandor Szabo [10]. The third version is the Las Vegas method, where the program instances start in parallel, and when one is finished, this information is given to the others, thus speeding up their solution time, as the subproblems can be reduced with the aid of this information. In our case, if the algorithm for a given subproblem reports that there is no k-clique that contains that edge, then we delete this particular edge from all the subproblems, including those that are already running. For this purpose we obviously need a sequential clique search program where an edge can be deleted during the runtime.
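The essential point of the third version is that a finished subproblem shrinks the ones still running. The sketch below illustrates that information flow in a shared-memory setting (the paper uses separate processes on a supercomputer); the toy graph, the brute-force k-clique test and all the names are our own and only stand in for the real clique search.

```python
import itertools
import random
import threading

# Tiny random graph; one subproblem per edge asks: can a k-clique contain this edge?
random.seed(1)
n, k = 12, 5
edges = {frozenset(p) for p in itertools.combinations(range(n), 2) if random.random() < 0.5}

deleted = set()                 # edges already proved to lie in no k-clique (shared state)
lock = threading.Lock()

def has_kclique_through(edge, alive):
    """Brute-force check whether some k-clique of the graph `alive` uses `edge`."""
    u, v = tuple(edge)
    for rest in itertools.combinations(set(range(n)) - edge, k - 2):
        nodes = (u, v) + rest
        if all(frozenset(p) in alive for p in itertools.combinations(nodes, 2)):
            return True
    return False

def worker(edge):
    with lock:
        alive = edges - deleted          # snapshot: subproblems started later see a smaller graph
    if edge not in alive:
        return                           # another subproblem already removed this edge
    if not has_kclique_through(edge, alive):
        with lock:
            deleted.add(edge)            # publish the deletion to all other subproblems

threads = [threading.Thread(target=worker, args=(e,)) for e in edges]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"{len(deleted)} of {len(edges)} edges can be deleted: no {k}-clique contains them")
```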
5 RESULTS
I used three sets of graph problems. The first set consists of random graphs with a given edge probability. The second set is taken from the DIMACS challenge website [5][6]. In the third set two extremely hard problems are represented: one from coding theory [3][12], the other from combinatorial optimization [11]. For all problems we know the clique size, so I ran the algorithm to prove that there is no clique bigger by one than the known clique number. This step is important because finding the maximum clique itself depends largely on luck and thus says little about the quality of an algorithm, while proving that there is no clique bigger by one than the known one requires an extended search through the whole search tree, and thus provides a good comparison for different algorithms and implementations.
The tables show the name of a problem instance, the size (N), the density (%), the maximum clique size (clique), the running time of the sequential algorithm on the same computer (seq), and the number of subproblems that arise in the algorithm (parts). The tests were run on 4+1, 16, 64 and up to 512 processor cores (one core doing the distribution and not taking part in the calculation itself), and the running times are given in seconds for those core numbers. The "nopt" rows are from the first version with no help between the subproblems, "opt" stands for the more optimal version with help from one instance to another by the original sequence, and "lv" represents the Las Vegas method of parallelization. If the running time exceeded the time limit, the table contains no data (*).
The produced results seem to support the idea. The running times of the third algorithm were in most cases close to the running times of the second algorithm, which gives help to the other subproblems; this is an interesting fact by itself. Even more interesting, in some cases it surpassed the second algorithm. These were the most difficult cases, so this method could perhaps be useful for the solution of the most difficult problems.

Table 2: Problems from the DIMACS challenge

           brock800_3  latin_square_10  keller5  MANN_a45  p_hat1500-1  p_hat500-3
N                 800              900      776      1035         1500         500
%                  65               76       75        99           25          75
clique             25               90       27       345           12          50
parts            4888              380      420        45        14918         657
seq              7302             4902     4531      3666          278           *
5-nopt              *                *        *      1340          894           *
5-opt               *             1423        *       719          814           *
5-lv                *             1504        *      1051          824           *
16-nopt             *              531      986       402          247           *
16-opt              *              403      672       205          225           *
16-lv               *              430      686       388          232           *
64-nopt           472              150      318       183           60           *
64-opt            413              105      173       140           54         11k
64-lv             425              128      228       174           56        7165
512-nopt           64               82      138       183            9         41k
512-opt            55               62      137       140            8         11k
512-lv             59               83      138       183            8        5300

Table 3: Problems of monotonic matrices and deletion codes

           monoton-7  monoton-8  monoton-9  deletion-9
N                343        512        729         512
%                 79         82         84          93
clique            19         23         28          52
parts            313        590        932         375
seq                7       2347          *           *
5-nopt            76          *          -           -
5-opt             74       1282          -           -
5-lv              74       1292          -           -
16-nopt           23        959          -           -
16-opt            21        408          -           -
16-lv             22        385          -           -
64-nopt            8        475       150k           -
64-opt             6        409       150k           -
64-lv              6        195        44k           -
512-nopt           4        405       150k           *
512-opt            2        409       150k           *
512-lv             4        243        31k        255k
Table 1: Random graph problems

N          200   300   500   500   500   1000   1000   1000
%           90    80    60    70    80     40     50     60
clique      40    29    17    22    32     12     15     20
parts      152   540  2478  2231  1664  10918  10955   9823
seq        623   898    67  3453     *    136    447    15k
5-nopt     376   466   431     *     *   1268      *      *
5-opt      109   231   420  1401     *   1156      *      *
5-lv       126   242   424  1444     *   1168      *      *
16-nopt    123   135   119   584     *    350      *      *
16-opt      33    64   116   387     *    319      *      *
16-lv       66    71   118   407     *    329      *      *
64-nopt     49    45    29   142     *     84    368   1236
64-opt      27    16    28    93   18k     76    345   1064
64-lv       39    23    29    99   21k     79    355   1100
512-nopt    38    23     4    25   14k     11     48    158
512-opt     27    15     4    14  6595     10     45    135
512-lv      39    23     4    19  4189     10     46    142

ACKNOWLEDGEMENTS
The author would like to thank the HPC Europe grant for the fruitful visit to Helsinki, and the Finnish Computer Science Center, which hosts the supercomputer Sisu on which the computations were performed.

References
[1] Hammersley, J.M. and Handscomb, D.C. Monte Carlo Methods. London, 1975 (1964).
[2] Babai, L. Monte-Carlo algorithms in graph isomorphism testing. Université de Montréal, D.M.S. No. 79-10. http://people.cs.uchicago.edu/~laci/lasvegas79.pdf
[3] Bogdanova, G.T., Brouwer, A.E., Kapralov, S.N. and Ostergaard, P.R.J. Error-Correcting Codes over an Alphabet of Four Elements. Designs, Codes and Cryptography, August 2001, Volume 23, Issue 3, pp. 333–342.
[4] Bomze, I.M., Budinich, M., Pardalos, P.M. and Pelillo, M. (1999). The Maximum Clique Problem. In D.-Z. Du and P.M. Pardalos (Eds.), Handbook of Combinatorial Optimization (pp. 1–74). Kluwer Academic Publishers.
[5] DIMACS. ftp://dimacs.rutgers.edu/pub/challenge/graph/ (May 30, 2014)
[6] Hasselberg, J., Pardalos, P.M. and Vairaktarakis, G. Test case generators and computational results for the maximum clique problem. Journal of Global Optimization (1993), 463–482.
[7] Luby, M. and Ertel, W. Optimal Parallelization of Las Vegas Algorithms. In Enjalbert, P. et al. (Eds.), Lecture Notes in Computer Science (1994), pp. 461–474. Springer Berlin Heidelberg.
[8] Luby, M., Sinclair, A. and Zuckerman, D. Optimal Speedup of Las Vegas Algorithms. In: Proceedings of the 2nd Israel Symposium on Theory of Computing and Systems, Jerusalem, Israel, June 1993.
[9] Ostergaard, P.R.J. A fast algorithm for the maximum clique problem. Discrete Applied Mathematics (2002), 197–207.
[10] Szabo, S. Parallel algorithms for finding cliques in a graph. Journal of Physics: Conference Series, Volume 268, Number 1, 2011. J. Phys.: Conf. Ser. 268 012030. doi:10.1088/1742-6596/268/1/012030
[11] Szabo, S. Monotonic matrices and clique search in graphs. Annales Univ. Sci. Budapest., Sect. Comp. (2013), 307–322.
[12] Sloane, N. http://neilsloane.com/doc/graphs.html (May 30, 2014)
[13] Truchet, C., et al. Prediction of Parallel Speed-ups for Las Vegas Algorithms. http://arxiv.org/abs/1212.4287, 2012.
[14] Zavalnij, B. Three Versions of Clique Search Parallelization. Journal of Computer Science and Information Technology, June 2014, Vol. 2, No. 2, pp. 09–20.

RESOURCE-DEMAND MANAGEMENT IN A SMART CITY

Jernej Zupančič1, Damjan Kužnar1, Boštjan Kaluža1, Matjaž Gams1
1Department of Intelligent Systems, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana
e-mail: jernej.zupancic@ijs.si
ABSTRACT
Due to the increasing demand and the limited amount of natural resources, the costs of resource supply are increasing. Reasons for the increasing demand are also the growing needs of urban life in the city and the lack of mechanisms that would encourage proper resource consumption. In the paper we present a hierarchical architecture for resource-demand management in a Smart city. The proposed architecture enables distributed computing and robust resource-demand management. Further, we present a two-stage mechanism that encourages the reduction of resource consumption: in the first stage it ensures that all the consumers are satisfied with the resource per-unit price, and in the second stage rewards are offered to consumers who are prepared to further lower their consumption.

1 INTRODUCTION
The demand for resources such as electricity, water, natural gas and oil is on the rise. Together with the limited amount of natural resources, this contributes to the increasing costs of resource supply [1]. Consumers usually pay the same per-unit price for a resource, although large consumers are responsible for the rising cost. With the prevalence of metering technology and the increasing sustainability awareness of consumers, different levels of consumption are urged to be priced differently.
Using a proper mechanism, one could lower resource demand peaks, which are usually expensive (additional, less efficient means of resource production have to be enabled), by shifting the consumption of the resource to a part of the day when the resource is in low demand. This way the resource can be used much more efficiently, and investment into resource production units that are required to meet peak demands can be postponed.
A resource-demand management mechanism that sets the same price for every consumer in advance does not fully exploit demand-management capabilities. Dynamic-pricing mechanisms (the prices change every so often) were already proposed (for instance in [6]). Most of such mechanisms require the consumers either to report their utility function (which raises privacy concerns) or to leave the decision about consumption to the consumers themselves (which is not as efficient as it could be). A dynamic-pricing mechanism that negotiates prices with every individual was proposed in [3]. However, their approach has to assume that the consumers will report their consumption truthfully, which is not always the case. Truth-incentive demand management was proposed in [4].
The mechanism described in [4] has several desirable properties besides being truth-incentive: it is proved to converge to a Nash equilibrium, it is budget balanced (the total cost of resource provision equals the total cost the consumers pay), and it reaches the global optimum (minimal peak-to-average ratio). However, it still raises some issues: the resource consumption of every consumer is known to everyone, consumers are only encouraged to shift their load to different parts of the day and not to reduce consumption, smaller consumption does not result in a smaller per-unit price, and real-time pricing requires price prediction capabilities.
In this paper we give a short presentation of a mechanism [7] that addresses some of those shortcomings. It changes the prices dynamically and adapts them to each consumer individually, it is truth-incentive, it encourages lower resource consumption and it preserves the privacy of the consumer. Further, we present the architecture that enables the application of the proposed mechanism.
The rest of the paper is structured as follows. In section 2 we present the envisaged architecture for resource-demand management in a Smart city. In section 3 we give a general description of demand-management mechanisms, and in section 4 the negotiation protocol that encourages consumption reduction of convexly priced resources is presented. Section 5 summarizes and concludes the paper.
2 ARCHITECTURE FOR RESOURCE-DEMAND MANAGEMENT IN SMART CITY
Every city is hierarchically structured into districts, then further into streets, and then even further into individual buildings with devices and appliances. Therefore, it is only natural that a resource-demand management architecture acknowledges this hierarchical structure. Although a hierarchical structure has some disadvantages compared to the star formation (Figure 1) that is typically used for resource-demand management (a hierarchical structure requires more communication nodes and more messages to be transmitted, which could result in slower response), it also possesses some desirable properties: it is distributed, it is more resistant against failures, and it can better represent the reality and real-world decision making.
Decision nodes in the architecture have to communicate among themselves, they need some computational ability, and they have to take into account real-world decision-maker preferences when taking decisions by themselves. They are agents forming a multi-agent system.

[Figure 1: Hierarchical structure on the right and star formation on the left]

The hierarchical organizational structure of the multi-agent system can be described as follows. At the root of the structure is the top decision entity in the city, responsible for setting the cost of resource provision and the global price computation of the resource. We call the root node a Resource negotiator agent. At lower levels there are Aggregation agents that propagate the price information from higher nodes to lower nodes and resource consumption information from lower to higher nodes. They could be independent, since every district could have its own policies on resource consumption. At the lowest level there are House coordinator agents that can operate a group of appliances and devices. Every parent node can either negotiate with its child node/agent (when the child agent does not reveal its information) or optimize it (when the child agent reveals its information and allows the parent agent to control it).
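The hierarchy can be pictured as a simple agent tree in which consumption information flows upward and price information flows downward. The sketch below is our illustration of that structure (the class and method names are not from the authors' implementation), with the negotiate-or-optimize choice reduced to a flag on each child.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Agent:
    name: str
    children: List["Agent"] = field(default_factory=list)
    reveals_info: bool = False   # a revealing child may be optimized directly by its parent
    demand: float = 0.0          # leaf (house coordinator) level consumption

    def collect_demand(self) -> float:
        """Consumption information flows from the houses up to the root."""
        if not self.children:
            return self.demand
        return sum(child.collect_demand() for child in self.children)

    def push_price(self, per_unit_price: float) -> None:
        """Price information flows from the root (Resource negotiator) downward."""
        for child in self.children:
            if child.reveals_info:
                pass                              # placeholder: parent optimizes the child directly
            else:
                child.push_price(per_unit_price)  # otherwise the price is forwarded / negotiated

# City -> district -> street -> houses, mirroring the hierarchical side of Figure 1
houses = [Agent(f"House{i}", demand=d) for i, d in enumerate([1.2, 0.8, 2.5], start=1)]
street = Agent("Street1", children=houses)
district = Agent("District1", children=[street])
city = Agent("City", children=[district])        # root: Resource negotiator agent
print("total city demand:", city.collect_demand())
city.push_price(0.15)
```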
3 RESOURCE PRICING MECHANISM
In the previous section we presented an underlying architecture that defines the network of nodes and the connections between them. In this section we give some general information about mechanisms or protocols. Mechanisms define how the communication is carried out over the proposed architecture, what the content of the communication is, and what the rules for the interaction between the nodes are. We can classify mechanisms according to their properties.
A mechanism is strategy proof or truthful when the best the agents can do while the mechanism is in action is to tell the truth. There is no incentive for any of the agents to lie about its information.
A mechanism converges when it reaches some final distribution of prices and consumptions in a finite number of steps. The convergence property guarantees that the mechanism ends; however, the speed of convergence is important as well and should be addressed.
A mechanism is budget-balanced when the cost of resource provision to the consumers is the same as the total cost the consumers pay for the consumed resource. No agent can benefit from the mechanism, be it a resource producer or a resource consumer.
All the mentioned properties are preferred in a mechanism that is to be implemented in reality.
Mechanisms can be further divided according to the length of the time period for which they determine consumption. Resource consumption over the day can be divided into finitely many time intervals. A dynamic mechanism determines the resource consumption for the next short time interval (on the scale of an hour); in that case the price and consumption are set dynamically every time a new time interval approaches. This type of mechanism is used when it is difficult to determine the cost of resource production and distribution far in advance. With a dynamic mechanism the agents cannot schedule the operation of the devices or appliances for the whole day; the resource demand dynamically matches the price of the resource.
A scheduling mechanism determines the price and the resource consumption for every time interval of the whole day. Appliance operation is scheduled in a way that optimizes the cost or energy efficiency and meets the requirements of the user. When using a scheduling mechanism, the resource negotiator needs good knowledge of the resource provision cost for every time interval of the whole day. Unexpected events can greatly disturb the schedule set by the agents.

4 A DYNAMIC NEGOTIATION MECHANISM FOR CONVEXLY PRICED RESOURCES
In this section we give a quick overview of a negotiation mechanism that encourages the reduction of consumption of a convexly priced resource. A more detailed description can be found in [7]. The mechanism consists of two stages: a negotiation stage and a renunciation stage. In the first stage the goal is to reduce resource consumption to a level where every consumer is satisfied with the price it has to pay. In the second stage rewards are offered to consumers that are prepared to further lower their consumption.
In the negotiation stage many rounds take place, with consumers reporting their desired consumption to the negotiator and the negotiator computing prices for every consumer. The prices are computed individually for every consumer using the serial cost sharing mechanism [5]. The serial cost sharing mechanism determines a fair price for every consumer – lower consumption results in a lower price. Further, since the resource is convexly priced, it also results in lower per-unit pricing.
Individual prices are computed by a function price(i, f, c) that takes as inputs the following parameters: consumer i, a sorted consumption vector c and the resource cost function f. The function price is defined recursively in Eq. (1).
\[
\mathrm{price}(0, f, c) = 0, \qquad
\mathrm{price}(i, f, c) = \mathrm{price}(i-1, f, c) +
\frac{f\!\left(\sum_{j=1}^{i-1} c[j] + (N+1-i)\,c[i]\right) - f\!\left(\sum_{j=1}^{i-1} c[j] + (N+1-i)\,c[i-1]\right)}{N+1-i}
\tag{1}
\]

where N is the number of consumers participating in the negotiation and c[0] is taken to be 0.
If any of the consumers does not agree with the price it receives and wants to further reduce its consumption, anticipating a lower per-unit price, another round of negotiation takes place. In the following round the desired consumption of an individual consumer can be the same or lower. The negotiation stage ends when the demand is the same for every consumer in two consecutive rounds.
In the renunciation stage the consumers that further reduce their consumption are rewarded, under the condition that the reward they are offered is sufficient to compensate their consumption reduction. There is only one round in the renunciation stage. Consumers are addressed one by one, starting with the consumer with the lowest demand. The algorithm for the renunciation stage outputs new prices while taking into account further reductions offered by consumers and the rewards demanded. The negotiator computes the reduction in the total resource supply cost (the cost is lowered due to lower demand), which can be offered as a reward to a consumer. To ensure that the consumers who had lower resource demand after the negotiation stage receive a lower final per-unit price, we may have to adjust the reward. If the consumer agrees with the reward, it lowers its consumption and receives a discount.
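A direct transcription of Eq. (1) is given below as a sketch (our code, not the authors'); it takes c[0] = 0 for the base case and uses a quadratic function as a stand-in for the convex resource cost. The printed totals illustrate the budget-balance property: the sum of the individual prices equals the cost of supplying the total consumption.

```python
def price(i, f, c):
    """Serial cost sharing price of consumer i (1-based), following Eq. (1).
    c is the consumption vector sorted in ascending order, f the convex cost."""
    N = len(c)
    if i == 0:
        return 0.0
    prefix = sum(c[:i - 1])                 # sum_{j=1}^{i-1} c[j]
    prev = c[i - 2] if i >= 2 else 0.0      # c[i-1] in Eq. (1), with c[0] taken as 0
    step = (f(prefix + (N + 1 - i) * c[i - 1]) -
            f(prefix + (N + 1 - i) * prev)) / (N + 1 - i)
    return price(i - 1, f, c) + step

f = lambda x: x ** 2                        # convex stand-in cost function
c = sorted([1.0, 2.0, 3.0])                 # reported consumptions, ascending
prices = [price(i, f, c) for i in range(1, len(c) + 1)]
print(prices)                               # individual prices, non-decreasing
print(sum(prices), f(sum(c)))               # both 36.0: budget balanced
```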
4.1 Mechanism properties
The presented mechanism has several desirable properties. In this section we list them and present sketches of the proofs.
Negotiations converge in a finite number of steps. The renunciation stage ends in one round; therefore, we only have to show that the negotiation stage of the protocol ends in a finite number of rounds. Since there is a finite number of consumers, every consumer has a finite number of appliances and devices, and every appliance or device has a finite number of operating states, there is a finite number of consumption level combinations. Since the consumers cannot increase their desired consumption in two consecutive rounds, a repetition of demand will surely occur. Therefore, the negotiation will end.
The protocol is strategy-proof when a consumer is risk-averse. A consumer could insist on a large desired consumption in the negotiation stage and then try to obtain a large reward when reducing the resource consumption in the second stage. However, rewards are first given to the small consumers and those rewards are also the largest, since the pricing of the resource is convex. Further, the rewards are adjusted so that a consumer with a higher demand after the negotiation stage cannot obtain a lower per-unit price in the renunciation stage. Therefore, by lying an agent risks getting a price it is not prepared to pay. Truth-telling is therefore the best strategy for a risk-averse agent.
Consumers are motivated to reduce resource consumption by the protocol itself. Due to the serial cost sharing mechanism, which is used for pricing in the negotiation stage, lower demand results in a lower resource price. Further, since the resource is convexly priced, the per-unit price is reduced as well.
The protocol is budget-balanced. The serial cost sharing mechanism defines prices in a way that the cost of resource supply equals the total cost the consumers have to pay. In the renunciation stage, when a consumer reduces its resource consumption, a budget imbalance (a surplus) occurs in favor of the negotiator (the seller of the resource), since larger consumers pay a higher price than needed to obtain the desired amount of the resource. The surplus is then offered to the consumer who reduced its consumption, and sometimes even to the consumers with lower consumption, due to the per-unit price equalization. Therefore, no surplus is generated at the end of the renunciation stage.

4.2 Experiments
We tested the negotiation mechanism on a multi-agent system implemented in the JADE (Java Agent DEvelopment) framework [2]. We implemented a resource negotiator agent and house representative agents in star formation. Each house representative agent possesses the information about the electrical energy consumption of the devices and the information about consumer preferences (the maximal per-unit price for operating each device and the required reward for not operating the device). This information is private to the house representative agent and is sent neither to the resource negotiator nor to other house representative agents. A linear piecewise function was chosen for the convex resource cost function.
In a typical simulation run a reduction of the energy consumption level can be observed (Figure 2). In the first round of the negotiation (up to the dashed line), every house representative achieves the price it is willing to pay. A further reduction in resource consumption occurs in the renunciation stage (between the dashed and the solid line), where the per-unit price is reduced as well for the low-consuming houses H1 and H2. The large-consuming house H3 does not receive a reward large enough to further reduce its consumption.
In a scalability test we gradually increased the number of agents involved in the experiment up to 100 000. Linear scalability of the mechanism is observed (Figure 3).

[Figure 2: Cumulative consumption observed during negotiation, with final prices added]
[Figure 3: Scatter plot for the scalability experiment (protocol execution time [s] vs. number of agents)]
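To tie the two stages together, the following toy run (our simplification, not the implementation used in the experiments) iterates negotiation rounds until no consumer changes its demand and then performs a single renunciation pass. Consumer preferences are reduced to a maximal per-unit price and a reward threshold, demands change in whole units, and the cost function is again a quadratic stand-in.

```python
def serial_prices(c, f):
    """Iterative serial cost sharing prices (Eq. (1)) for ascending consumption vector c."""
    N, out, total = len(c), [], 0.0
    for i in range(1, N + 1):
        prefix = sum(c[:i - 1])
        prev = c[i - 2] if i >= 2 else 0.0
        total += (f(prefix + (N + 1 - i) * c[i - 1]) -
                  f(prefix + (N + 1 - i) * prev)) / (N + 1 - i)
        out.append(total)
    return out

f = lambda x: x ** 2                         # convex stand-in resource cost

# Hypothetical consumer preferences: desired demand, maximum acceptable
# per-unit price, and the reward needed to give up one more unit.
consumers = [
    {"demand": 3.0, "max_unit_price": 7.0, "reward_needed": 4.0},
    {"demand": 2.0, "max_unit_price": 7.0, "reward_needed": 3.0},
    {"demand": 1.0, "max_unit_price": 6.0, "reward_needed": 2.5},
]

# --- Negotiation stage: repeat until no consumer changes its demand --------
while True:
    consumers.sort(key=lambda a: a["demand"])
    prices = serial_prices([a["demand"] for a in consumers], f)
    changed = False
    for agent, p in zip(consumers, prices):
        if agent["demand"] > 0 and p / agent["demand"] > agent["max_unit_price"]:
            agent["demand"] -= 1.0           # price too high: lower the desired demand
            changed = True
    if not changed:
        break

# --- Renunciation stage: one pass, smallest consumer first -----------------
base_cost = f(sum(a["demand"] for a in consumers))
for agent in consumers:                      # already sorted ascending
    reduced = f(sum(a["demand"] for a in consumers) - 1.0)
    surplus = base_cost - reduced            # supply cost saved if this agent gives up a unit
    if agent["demand"] >= 1.0 and surplus >= agent["reward_needed"]:
        agent["demand"] -= 1.0               # reward is sufficient: accept the reduction
        base_cost = f(sum(a["demand"] for a in consumers))

print([a["demand"] for a in consumers])
print(serial_prices([a["demand"] for a in consumers], f))
```

In the real protocol the reward is additionally adjusted so that a consumer with a higher demand after the negotiation stage can never end up with a lower per-unit price than a smaller consumer; that correction is omitted here for brevity.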
5 CONCLUSION
In the paper we presented a hierarchical architecture for resource-demand management in smart cities, which is a natural representation of the city and enables distributed computation and robust control. We defined classes of mechanisms that can be applied to the proposed architecture. In the second part of the paper we presented a dynamic mechanism that encourages the reduction of resource consumption when the resource cost function is convex. It is a two-stage mechanism that ensures consumer satisfaction in the first stage when the consumer is truthful, and encourages further reduction of consumption in the second stage by offering rewards. The mechanism has several desirable properties: it is budget-balanced, it converges in a finite number of steps, it is strategy proof, and it scales linearly with the number of agents that participate in the mechanism.
Further work will include the analysis and modelling of consumer behaviour. The mechanism will then be applied to real-world models. The goal of this research is to provide a modular mechanism that will incorporate dynamic demand-response together with scheduling. Further, it will be able to deal with different types of architectures, agents that have hidden information, and agents that reveal all their information. That kind of mechanism will combine optimization and negotiation in an efficient and universal way.

Acknowledgements
We thank Gregor Grasselli, Matej Krebelj and Jure Šorn for the help with the implementation of the experiments in the JADE environment. The research was sponsored by the ARTEMIS Joint Undertaking, Grant agreement nr. 333020, and the Slovenian Ministry of Economic Development and Technology.

References
[1] E. B. Barbier. Economics, natural-resource scarcity and development: conventional and alternative views. Routledge, 2013.
[2] F. L. Bellifemine, G. Caire, D. Greenwood. Developing multi-agent systems with JADE. John Wiley & Sons, 2007.
[3] F. Brazier, F. Cornelissen, R. Gustavsson, C. M. Jonker, O. Lindeberg, B. Polak, J. Treur. Agents negotiating for load balancing of electricity use. Distributed Computing Systems, 1998. Proceedings. 18th International Conference on, pp. 622–629, 1998.
[4] A.-H. Mohsenian-Rad, V. W. S. Wong, J. Jatskevich, R. Schober, A. Leon-Garcia. Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid. Smart Grid, IEEE Transactions on 1(3), pp. 320–331, 2010.
[5] H. Moulin, S. Shenker. Serial cost sharing. Econometrica: Journal of the Econometric Society, pp. 1009–1037, 1992.
[6] P. Samadi, H. Mohsenian-Rad, R. Schober, V. W. S. Wong. Advanced demand side management for the future smart grid using mechanism design. Smart Grid, IEEE Transactions on 3(3), pp. 1170–1180, 2012.
[7] J. Zupančič, D. Kužnar, B. Kaluža, M. Gams. Two-stage negotiation protocol for lowering the consumption of convexly priced resources. IAT4SIS '14, Workshop proceedings, to be published, 2014.

Indeks avtorjev / Author index

Ambrožič Nejc ... 74
Aydın Cevat ... 78
Bohanec Marko ... 62
Bosnić Zoran ... 18
Brence Jure ... 5
Černe Matija ... 9
Cvetković Božidara ... 14
Demšar Jaka ... 18
Dovgan Erik ... 22
Džeroski Sašo ... 97
Filipič Bogdan ... 22, 66, 74
Frešer Martin ... 26
Gams Matjaž ... 5, 30, 34, 42, 46, 70, 74, 93, 102, 109
Gantar Klemen ... 22
Gjoreski Hristijan ... 34, 38
Gjoreski Martin ... 38
Gosar Žiga ... 5
Grad Janez ... 102
Gradišek Anton ... 42
Jovan Leon Noe ... 46
Kaluža Boštjan ... 9, 50, 85, 109
Kerkhoff Rutger ... 50
Koblar Valentin ... 22
Konecki Mario ... 54
Kononenko Igor ... 18
Košir Igor ... 26
Kulakov Andrea ... 38
Kužnar Damjan ... 46, 109
Luštrek Mitja ... 9, 14, 26, 34, 42, 58, 70, 102
Madevska Bogdanova Ana ... 89
Martinčič Ipšić Sandra ... 70
Mihelčić Matej ... 62
Mirchevska Violeta ... 26
Mlakar Miha ... 66
Nikič Svetlana ... 46
Piltaver Rok ... 70, 74
Sabancı Kadir ... 78
Šef Tomaž ... 74, 81, 93
Seražin Vid ... 5
Simidjievski Nikola ... 97
Slapničar Gašper ... 85
Somrak Maja ... 42, 58
Šorn Jure ... 93
Tashkoski Martin ... 89
Tavčar Aleš ... 50, 74, 93
Tušar Tea ... 66, 74, 93
Vidmar Nina ... 97
Yusupov Mukhiddin ... 102
Zavalnij Bogdan ... 105
Zupančič Jernej ... 5, 109

Konferenca / Conference: Inteligentni sistemi / Intelligent Systems
Uredili / Edited by: Rok Piltaver, Matjaž Gams