Zbornik 19. mednarodne multikonference INFORMACIJSKA DRUŽBA - IS 2016 Zvezek A Proceedings of the 19th International Multiconference INFORMATION SOCIETY - IS 2016 Volume A Slovenska konferenca o umetni inteligenci Slovenian Conference on Artificial Intelligence Uredili / Edited by Matjaž Gams, Mitja Luštrek, Rok Piltaver http://is.ijs.si 12. oktober 2016 / 12 October 2016 Ljubljana, Slovenia Zbornik 19. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2016 Zvezek A Proceedings of the 19th International Multiconference INFORMATION SOCIETY – IS 2016 Volume A Slovenska konferenca o umetni inteligenci Slovenian Conference on Artificial Intelligence Uredili / Edited by Matjaž Gams, Mitja Luštrek, Rok Piltaver 12. oktober 2016 / 12 October 2016 Ljubljana, Slovenia Uredniki: Matjaž Gams Odsek za inteligentne sisteme Institut »Jožef Stefan«, Ljubljana Mitja Luštrek Odsek za inteligentne sisteme Institut »Jožef Stefan«, Ljubljana Rok Piltaver Odsek za inteligentne sisteme Institut »Jožef Stefan«, Ljubljana Založnik: Institut »Jožef Stefan«, Ljubljana Priprava zbornika: Mitja Lasič, Vesna Lasič, Lana Zemljak Oblikovanje naslovnice: Vesna Lasič Dostop do e-publikacije: http://library.ijs.si/Stacks/Proceedings/InformationSociety Ljubljana, oktober 2016 CIP - Kataložni zapis o publikaciji Narodna in univerzitetna knjižnica, Ljubljana 004.8(082)(0.034.2) SLOVENSKA konferenca o umetni inteligenci (2016 ; Ljubljana) Slovenska konferenca o umetni inteligenci [Elektronski vir] : zbornik 19. mednarodne multikonference Informacijska družba - IS 2016, 12. oktober 2016, [Ljubljana, Slovenija] : zvezek A = Slovenian Conference on Artificial Intelligence : proceedings of the 19th International Multiconference Information Society - IS 2016, 12 October 2016, Ljubljana, Slovenia : volume A / uredili, edited by Matjaž Gams, Mitja Luštrek, Rok Piltaver. - El. zbornik. - Ljubljana : Institut Jožef Stefan, 2016 Način dostopa (URL): http://library.ijs.si/Stacks/Proceedings/InformationSociety/2016/IS2016_Volume_A% 20-%20SKUI.pdf ISBN 978-961-264-097-2 (pdf) 1. Gl. stv. nasl. 2. Vzp. stv. nasl. 3. Gams, Matjaž, računalništvo 4. Mednarodna multikonferenca Informacijska družba (19 ; 2016 ; Ljubljana) 1537202883 PREDGOVOR MULTIKONFERENCI INFORMACIJSKA DRUŽBA 2016 Multikonferenca Informacijska družba (http://is.ijs.si) je z devetnajsto zaporedno prireditvijo osrednji srednjeevropski dogodek na področju informacijske družbe, računalništva in informatike. Letošnja prireditev je ponovno na več lokacijah, osrednji dogodki pa so na Institutu »Jožef Stefan«. Informacijska družba, znanje in umetna inteligenca so spet na razpotju tako same zase kot glede vpliva na človeški razvoj. Se bo eksponentna rast elektronike po Moorovem zakonu nadaljevala ali stagnirala? Bo umetna inteligenca nadaljevala svoj neverjetni razvoj in premagovala ljudi na čedalje več področjih in s tem omogočila razcvet civilizacije, ali pa bo eksponentna rast prebivalstva zlasti v Afriki povzročila zadušitev rasti? Čedalje več pokazateljev kaže v oba ekstrema – da prehajamo v naslednje civilizacijsko obdobje, hkrati pa so planetarni konflikti sodobne družbe čedalje težje obvladljivi. Letos smo v multikonferenco povezali dvanajst odličnih neodvisnih konferenc. Predstavljenih bo okoli 200 predstavitev, povzetkov in referatov v okviru samostojnih konferenc in delavnic. Prireditev bodo spremljale okrogle mize in razprave ter posebni dogodki, kot je svečana podelitev nagrad. Izbrani prispevki bodo izšli tudi v posebni številki revije Informatica, ki se ponaša z 39-letno tradicijo odlične znanstvene revije. Naslednje leto bo torej konferenca praznovala 20 let in revija 40 let, kar je za področje informacijske družbe častitljiv dosežek. Multikonferenco Informacijska družba 2016 sestavljajo naslednje samostojne konference: • 25-letnica prve internetne povezave v Sloveniji • Slovenska konferenca o umetni inteligenci • Kognitivna znanost • Izkopavanje znanja in podatkovna skladišča • Sodelovanje, programska oprema in storitve v informacijski družbi • Vzgoja in izobraževanje v informacijski družbi • Delavnica »EM-zdravje« • Delavnica »E-heritage« • Tretja študentska računalniška konferenca • Računalništvo in informatika: včeraj za jutri • Interakcija človek-računalnik v informacijski družbi • Uporabno teoretično računalništvo (MATCOS 2016). Soorganizatorji in podporniki konference so različne raziskovalne institucije in združenja, med njimi tudi ACM Slovenija, SLAIS, DKZ in druga slovenska nacionalna akademija, Inženirska akademija Slovenije (IAS). V imenu organizatorjev konference se zahvaljujemo združenjem in inštitucijam, še posebej pa udeležencem za njihove dragocene prispevke in priložnost, da z nami delijo svoje izkušnje o informacijski družbi. Zahvaljujemo se tudi recenzentom za njihovo pomoč pri recenziranju. V 2016 bomo četrtič podelili nagrado za življenjske dosežke v čast Donalda Michija in Alana Turinga. Nagrado Michie-Turing za izjemen življenjski prispevek k razvoju in promociji informacijske družbe bo prejel prof. dr. Tomaž Pisanski. Priznanje za dosežek leta bo pripadlo prof. dr. Blažu Zupanu. Že šestič podeljujemo nagradi »informacijska limona« in »informacijska jagoda« za najbolj (ne)uspešne poteze v zvezi z informacijsko družbo. Limono je dobilo ponovno padanje Slovenije na lestvicah informacijske družbe, jagodo pa informacijska podpora Pediatrične klinike. Čestitke nagrajencem! Bojan Orel, predsednik programskega odbora Matjaž Gams, predsednik organizacijskega odbora i FOREWORD - INFORMATION SOCIETY 2016 In its 19th year, the Information Society Multiconference (http://is.ijs.si) remains one of the leading conferences in Central Europe devoted to information society, computer science and informatics. In 2016 it is organized at various locations, with the main events at the Jožef Stefan Institute. The pace of progress of information society, knowledge and artificial intelligence is speeding up, but it seems we are again at a turning point. Will the progress of electronics continue according to the Moore’s law or will it start stagnating? Will AI continue to outperform humans at more and more activities and in this way enable the predicted unseen human progress, or will the growth of human population in particular in Africa cause global decline? Both extremes seem more and more likely – fantastic human progress and planetary decline caused by humans destroying our environment and each other. The Multiconference is running in parallel sessions with 200 presentations of scientific papers at twelve conferences, round tables, workshops and award ceremonies. Selected papers will be published in the Informatica journal, which has 39 years of tradition of excellent research publication. Next year, the conference will celebrate 20 years and the journal 40 years – a remarkable achievement. The Information Society 2016 Multiconference consists of the following conferences: • 25th Anniversary of First Internet Connection in Slovenia • Slovenian Conference on Artificial Intelligence • Cognitive Science • Data Mining and Data Warehouses • Collaboration, Software and Services in Information Society • Education in Information Society • Workshop Electronic and Mobile Health • Workshop »E-heritage« • 3st Student Computer Science Research Conference • Computer Science and Informatics: Yesterday for Tomorrow • Human-Computer Interaction in Information Society • Middle-European Conference on Applied Theoretical Computer Science (Matcos 2016) The Multiconference is co-organized and supported by several major research institutions and societies, among them ACM Slovenia, i.e. the Slovenian chapter of the ACM, SLAIS, DKZ and the second national engineering academy, the Slovenian Engineering Academy. In the name of the conference organizers we thank all the societies and institutions, and particularly all the participants for their valuable contribution and their interest in this event, and the reviewers for their thorough reviews. For the fourth year, the award for life-long outstanding contributions will be delivered in memory of Donald Michie and Alan Turing. The Michie-Turing award will be given to Prof. Tomaž Pisanski for his life-long outstanding contribution to the development and promotion of information society in our country. In addition, an award for current achievements will be given to Prof. Blaž Zupan. The information lemon goes to another fall in the Slovenian international ratings on information society, while the information strawberry is awarded for the information system at the Pediatric Clinic. Congratulations! Bojan Orel, Programme Committee Chair Matjaž Gams, Organizing Committee Chair ii KONFERENČNI ODBORI CONFERENCE COMMITTEES International Programme Committee Organizing Committee Vladimir Bajic, South Africa Matjaž Gams, chair Heiner Benking, Germany Mitja Luštrek Se Woo Cheon, South Korea Lana Zemljak Howie Firth, UK Vesna Koricki Olga Fomichova, Russia Mitja Lasič Vladimir Fomichov, Russia Robert Blatnik Vesna Hljuz Dobric, Croatia Aleš Tavčar Alfred Inselberg, Israel Blaž Mahnič Jay Liebowitz, USA Jure Šorn Huan Liu, Singapore Mario Konecki Henz Martin, Germany Marcin Paprzycki, USA Karl Pribram, USA Claude Sammut, Australia Jiri Wiedermann, Czech Republic Xindong Wu, USA Yiming Ye, USA Ning Zhong, USA Wray Buntine, Australia Bezalel Gavish, USA Gal A. Kaminka, Israel Mike Bain, Australia Michela Milano, Italy Derong Liu, Chicago, USA Toby Walsh, Australia Programme Committee Bojan Orel, chair Andrej Gams Vladislav Rajkovič Grega Nikolaj Zimic, co-chair Matjaž Gams Repovš Franc Solina, co-chair Marko Grobelnik Ivan Rozman Viljan Mahnič, co-chair Nikola Guid Niko Schlamberger Cene Bavec, co-chair Marjan Heričko Stanko Strmčnik Tomaž Kalin, co-chair Borka Jerman Blažič Džonova Jurij Šilc Jozsef Györkös, co-chair Gorazd Kandus Jurij Tasič Tadej Bajd Urban Kordeš Denis Trček Jaroslav Berce Marjan Krisper Andrej Ule Mojca Bernik Andrej Kuščer Tanja Urbančič Marko Bohanec Jadran Lenarčič Boštjan Vilfan Ivan Bratko Borut Likar Baldomir Zajc Andrej Brodnik Janez Malačič Blaž Zupan Dušan Caf Olga Markič Boris Žemva Saša Divjak Dunja Mladenič Leon Žlajpah Tomaž Erjavec Franc Novak Bogdan Filipič iii iv KAZALO / TABLE OF CONTENTS Ime konference SLO / Ime konference ENG ............................................................................................................. 1 PREDGOVOR / FOREWORD ................................................................................................................................. 3 PROGRAMSKI ODBORI / PROGRAMME COMMITTEES ..................................................................................... 4 Usage of SIGMO, a Decision Support System for the Assessment of Genetical y Modified Organisms in Food and Feed / Mileva Boshkoska Biljana, Bohanec Marko, W. Prins Theo, J. Kok Esther .......................... 5 Paral el Draws from the Polya-Gamma Distribution for Faster Bayesian Multinomial and Count Model Inference / Češnovar Rok, Štrumbelj Erik ......................................................................................................... 9 Fit4Work – Monitoring and Management of Physical, Mental and Environmental Stress at Work / Cvetković Božidara, Gjoreski Martin, Frešer Martin, Kosiedowski Michał, Luštrek Mitja ................................. 13 Bayesian Binary and Ordinal Regression with Structured Uncertainty in the Inputs / Dimitriev Aleksandar, Štrumbelj Erik ................................................................................................................................................... 17 Multiobjective Discovery of Driving Strategies Using the SCANeR Studio / Dovgan Erik ................................... 21 Ali nam metode strojnega učenja lahko pomagajo pri načrtovanju novih visokoentropijskih zlitin? / Gradišek Anton, Kocjan Andraž, Mlakar Miha ................................................................................................. 25 Markov Chain Model for Energy-Efficient Context Recognition / Janko Vito, Luštrek Mitja ................................. 28 Reliability Estimation of Individual Predictions for Incremental Models / Javornik Anže, Kononenko Igor, Pevec Darko ..................................................................................................................................................... 32 Dve nadgradnji algoritma backpropagation / Ploj Bojan, Terbuc Martin .............................................................. 36 Ebook Recommendations Based on Stylometric Features / Žitnik Lovro, Robnik-Šikonja Marko ....................... 40 Model Selection on the JSI Grid: Metis Use-Case / Zupančič Jernej, Kužnar Damjan, Gams Matjaž ................ 44 Sinteza slovenskega govora na mobilni platformi Android / Šef Tomaž .............................................................. 48 Showing the Knee of a 4-D Pareto Front Approximation via Different Visualization Methods / Tušar Tea, Filipič Bogdan ................................................................................................................................................... 52 Machine Learning Method for Stress Detection with an EEG Device / Gjoreski Martin, Luštrek Mitja, Gams Matjaž .................................................................................................................................................... 56 Design of Mobile Systems that Enable an Active and Meaningful Life for Dementia Patients / Fabjan David................................................................................................................................................................. 60 SeaAbs: Developing a Low Cost Seafloor Observatory / Kljun Matjaž, Tošić Aleksandar .................................. 64 Primerjava razdalj pri klasifikaciji in regresiji / Stepišnik Perdih Tomaž, Panov Panče, Bauer Andrej, Džerovski Saso ................................................................................................................................................. 68 Avtomatska prepoznava teniških udarcev iz senzorskih podatkov / Mlakar Miha ............................................... 72 Indeks avtorjev / Author index ................................................................................................................................ 77 v vi Zbornik 19. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2016 Zvezek A Proceedings of the 19th International Multiconference INFORMATION SOCIETY – IS 2016 Volume A Slovenska konferenca o umetni inteligenci Slovenian Conference on Artificial Intelligence Uredili / Edited by Matjaž Gams, Mitja Luštrek, Rok Piltaver 12. oktober 2016 / 12 October 2016 Ljubljana, Slovenia 1 2 PREDGOVOR Slovenska konferenca o umetni inteligenci je naslednica konference Inteligentni sistemi, ki je sestavni del multikonference Informacijska družba že od njenega začetka leta 1997. Za novo ime smo se odločili zaradi tesnejšega sodelovanje s Slovenskim društvom za umetno inteligenco. Pomemben cilj tako konference kot tudi društva je namreč povezovanje slovenskih raziskovalcev s tega področja, čeprav na konferenci niso nič manj dobrodošli tudi prispevki iz drugih držav. Žal letos nismo prejeli dovolj kvalitetnih prispevkov iz tujine, smo pa pritegnili nadpovprečen delež prispevkov iz organizacij izven Instituta »Jožef Stefan«, posebej s Fakultete za računalništvo in informatiko, ki ima skupaj z Institutom vodilno vlogo pri raziskavah umetne inteligence v Sloveniji, kar štejemo za uspeh. Tema konference ostajajo umetna inteligenca, inteligentni sistemi in inteligentne storitve informacijske družbe. Obravnavamo metode umetne inteligence, njene aplikacije v praksi, vpliv na družbo in gospodarstvo, njeno prihodnost ter priložnosti in nevarnosti, ki jih vnaša v informacijsko družbo. Ker konferenca stremi h kakovostnim mednarodno zanimivim prispevkom, je večina izmed 18 sprejetih v angleščini; a ker gre vendar za slovensko konferenco, ki želi spodbuditi dialog in sodelovanje med domačimi raziskovalci, sprejemamo tudi slovenske. Vsi prispevki so bili recenzirani in popravljeni v skladu z navodili recenzentov. Menimo, da je letošnja kakovost prispevkov visoka in upamo na še višjo prihodnje leto, ko bo konferenca praznovala 20. obletnico. FOREWORD Slovenian Conference on Artificial Intelligence is the successor of the Intelligent Systems conference, which has been a part of the Information Society multiconference since its establishment in 1997. The new name was adopted because of a closer collaboration with the Slovenian Artificial Intelligence Society. The reason for this collaboration is that the goal of both the conference and the society is bringing together Slovenian artificial intelligence researchers, although international papers are just as welcome at the conference. Unfortunately we have not received enough such papers this year, but we are proud to have attracted an unusually large number of papers from outside the Jožef Stefan Institute, particularly from the Faculty of Computer and Information Science, which shares the leading role in artificial intelligence research in Slovenia with the Institute. The topics of the conference remain artificial intelligence, intelligent systems and intelligent services in the information society. The conference addresses artificial intelligence methods, their practical application, the impact of the artificial intelligence on the society and industry, its future, and the opportunities and threats it presents to the information society. Since the conference aims at international relevance, the majority of the 18 accepted papers are in English; however, being a Slovenian conference aiming to open a dialogue and collaboration between national researchers, papers in Slovenian are also accepted. All the papers were reviewed and revised according to the reviewers' comments. The quality of this year's papers is high, and we hope for an even higher one next year when the conference will celebrate its 20th anniversary. Matjaž Gams, Mitja Luštrek, Rok Piltaver 3 PROGRAMSKI ODBOR / PROGRAMME COMMITTEE Mitja Luštrek, IJS (co-chair) Matjaž Gams, IJS (co-chair) Rok Piltaver, IJS (co-chair) Marko Bohanec, IJS Tomaž Banovec, Statistični zavod Cene Bavec, UP, FM Jaro Berce, UL, FDV Marko Bonač, ARNES Ivan Bratko, UL, FRI in IJS Dušan Caf, FIŠ Bojan Cestnik, Temida Aleš Dobnikar, CVI Bogdan Filipič, IJS Nikola Guid, FERI Borka Jerman Blažič, IJS Tomaž Kalin, DANTE Marjan Krisper, FRI Marjan Mernik, FERI Vladislav Rajkovič, FOV Ivo Rozman, FERI Nikolaj Schlamberger, Informatika Tomaž Seljak, IZUM Miha Smolnikar, IJS Peter Stanovnik, IER Damjan Strnad, FERI Peter Tancig, RZ Pavle Trdan, Lek Iztok Valenčič, GZ Vasja Vehovar, FDV Martin Žnidaršič, IJS 4 Usage of SIGMO, a Decision Support System for the Assessment of Genetically Modified Organisms in Food and Feed Biljana Mileva Marko Bohanec Theo W. Prins Esther J. Kok Boshkoska Jožef Stefan Institute RIKILT Wageningen RIKILT Wageningen Faculty of information Jamova 39, Ljubljana, University & Research University & Research studies in Novo mesto, and 6700 AE Wageningen, 6700 AE Wageningen, Ljubljanska cesta 32, University of Nova Netherlands Netherlands Novo mesto, and Gorica theo.prins@wur.nl Jožef Stefan Institute marko.bohanec@ij esther.kok@wur.nl Jamova 39, Ljubljana s.si biljana.mileva@ijs. si ABSTRACT products can reach the market and then be further on inspected. In SIGMO is a web-based decision support system for assessing the cases of finding unauthorized GMOs in products, these will be probability of existence of a genetically modified organisms in food withdrawn from the market, leading to significant financial losses and feed products imported in the European Union. The system has for the traders. To assist the traders and producers of complex been developed in the frame of the EU FP7 project Decathlon. In products, whose ingredients may contain GMOs, a decision support this paper, we describe SIGMO, provide examples of how to use it system called SIGMO (System for Identification of GMOs) has for training and evaluation purposes, and report the statistics of its been developed within the European Framework 7 project actual use. ‘Decathlon’ (http://www.decathlon-project.eu/). In this paper, we describe SIGMO, illustrate its application for training and Keywords evaluation through several examples, and report statistics of using decision support system, qualitative multi-criteria modeling, the system since it has been made available on-line. genetically modified organisms, SIGMO 1. INTRODUCTION 2. SIGMO The import of genetically modified organisms (GMOs) in the SIGMO [6] is a web-based decision support system (DSS) that has European Union (EU) has begun in 1994, and since then EU has been designed to provide help to producers, traders and importers developed different monitoring mechanisms to assess the risk of with the aims to: entering any GMO in its market. The goal of the risk assessment is • reduce the number of necessary GMO analyses; to ensure that imported food or feed is safe for human and animal health and the environment. Hence, the EU treats all genetically • better cope with the complexity of GMO market without modified crops (GMO crops) as "a modified food". All GMO crops requiring its users to have extensive knowledge on the that are intended to be admitted to the EU, are extensively evaluated world-wide production; by the European Food Safety Authority (EFSA) [1], which reports its findings to the European Commission (EC). The EC then • comply with EU GMO regulations in a cost-effective way. proposes whether to grant or refuse the authorisation for the GMO [2]. In the EU, currently (September 2016) 68 GMOs are authorised SIGMO is composed of: in cotton (10 GMOs), maize (36), oil seed rape (6), soybean (15), and sugar beet (1) [3]. Despite the large number of authorized • a data base providing data about GMO crop species GMOs, the import of non-authorized GMOs is still being detected produced and approved in counties worldwide; in EU, in particular from the US, Canada, Argentina, China and • a multi-attribute model for the assessment of GMO Brazil [4]. Whenever a product that contains GMO is imported, this presence in food/feed products; and information needs to be provided on the product label. There is a threshold of 0.9% for the adventitious and technically unavoidable • an on-line user interface available at presence of authorized GMOs in non-GMO batches that do not www.decathlon.ijs.si/gmo/. require labelling [5, 6]. Regardless of the strict measures, it still happens that products containing GMOs without labeling are imported into the EU. These 5 2.1 SIGMO database Figure 1. Hierarchical structure of the SIGMO multi-attribute model The SIGMO database consists of tree tables that hold data on The SIGMO model, as given in Figure 1, is implemented in DEXi countries, the list of possible status types that a GMO-containing software package [8]. To assess the likelihood of GMO presence in product may obtain regarding the GM presence, such as “High a product, the model is at the highest level decomposed into two likelihood”, “Medium likelihood”, and “Low likelihood”; and a list main subtrees: of all currently listed GMOs species. These tree tables are connected with three relations. The first relation, which defines the • TraceabilityData – data that accompanies each product and describes its geographical origin and path in the area of a particular crop planted in a certain country, determines if a certain GMO/country pair belongs to a region of increased GMO production/supply chain [9], and production. The second relation holds information regarding each • AnayticalData – provide information about the approved particular GMO event. For example, maize may have several gene and/or unapproved GMOs already detected in the modifications, each one having a different Event name. Such names product, and about the methods used to analyze the would be “MON810” encoded as “MONØØ81Ø-6”. Finally, the products for GMO content [10, 11]. third one defines a many-to-many relation between Events and Usually analytical data are rarely available with a product, but if Countries, associated with a particular status type. At the current they are available, they are potentially more relevant for assessing implementation, the database holds over 300 GMOs. the possible presence of (unauthorized) GMOs than traceability data and should thus take precedence. 2.2 SIGMO multi-attribute decision model A more detailed description of attributes and utility functions of the The central part of SIGMO is a multi-attribute decision model that model are given in [6]. provides an assessment of the potential presence of GMOs in imported feed or food products. The model has been developed using the methodology DEX (Decision Expert) [7]. DEX belongs 2.3 SIGMO user interface to the group of qualitative multi-criteria decision making methods. In DEX, a hierarchical structure containing qualitative attributes is When using SIGMO (http://decathlon.ijs.si/gmo/), the user is built which represents a decomposition of the decision problem into presented with an interactive input form, such as the one presented smaller, less complex and possibly easier to solve sub-problems. in Figure 2, in which the user enters data about the product to be There are two types of attributes in DEX: basic and aggregated. The assessed. The system provides guidance through drop-down data- former are the directly measurable attributes, which are used for entry lists and info buttons on the right hand side of each data-entry describing the decision options and represent input data to the field. model. The latter are obtained by aggregating the basic and/or other aggregated attributes. They represent the evaluations of the options. The hierarchical structure represents the dependencies among attributes such that the higher-level attributes depend on their immediate descendants in the tree. This dependency is defined by a utility function by the expert. Attribute tree Attribute Description GM_Presence Assessment of GM presence in food/feed products TraceabilityData GM presence due to traceability data Products GM presence due to product characteristics ProductGMPresence Assessment of GM presence in the product CropGMPresence Assessment of GM presence related to the crop GeoGMPresence Assessment of GM presence related to the geographical origin of the product EU Does the product originate in an EU country? GM_Region Does the product originate in a region of large GMO production? ProductComplexity Product type Countries Likelihood of GM presence due to the properties of countries and regions of origin NumberCountries Number of countries involved in storage CountryGMPresence Likelihood of GM presence with respect to involved countries CoexistenceMeasures Are coexistence measures in place in countries? Transportation Likelihood of GM presence due to transportation PrepackagedProduct Is product prepackaged? Logistics Likelihood of GM presence due to logistics LogComplexity GM presence due to logistics complexity NumberInteractions Number of interactions in the supply chain NumberCompanies Number of companies involved in logistics LogStorage Likelihood of GM presence due to storage used Harbour Has the product been shipped through harbour(s)? Silo Has the product been stored in silos? SystemsUsed GM presence due to used traceability systems TraceabilitySystemInPlace Is a traceability system in place? IP_GMO Are IP systems for GMO being used? IP_Other Are other IP systems being used? AnalCtrl_Systems Are there systems used that include analytical control? PrivateContracts Are there any private contracts? AnalyticalData GM presence due to analytical data AnalyticalResults GM presence according to analytical results AnalyticalResultsAvailable Are analytical results available? ApprovedGMOsIdentified Approved GMOs identified UnapprovedGMOsIdentified Unapproved GMOs identified Methods Risk due to applied methods Figure 2. SIGMO user interface ProcessingLevel Processing level AppropriateSampling Have appropriate sampling methods been used? AppropriateMethods Have appropriate analysis methods been used? Additionally, some parts of the input form are optional and open up Reliability Reliability of applied methods ReliabilityForApprovedGMO Reliability of applied methods to detect approved GMOs RelevantGMCropsIncluded Did the analysis include al the relevant GMO crops? only when necessary. Such examples are: Al IngredientsIncluded Are al ingredients, listed on the product label, included in the analysis? OmnipresentGMOsIncluded Are omnipresent GMO varieties included in the analysis? NumberScreenElem Number of screening elements ReliabilityForUnapprovedGMO Reliability of applied methods to detect unapproved GMOs • only when Product Complexity is “complex”, more than NumberScreenElem Number of screening elements AppropriateDataAnalysis Application of appropriate data analysis methods one ingredient can be entered; AppliedQualitySystem Can we trust the applied methods and analytical lab? ValidatedMethods Application of validated analytical methods AccreditedLab Is analytical laboratory acredited? • when Analytical results available is “yes”, further detail about analytical results are requested; 6 • for products that are not pre-packaged, further data for Figure 3. Cream Style Corn product from Thailand. Available transportation is requested; from https://www.alibaba.com/product-detail/Cream-Style- Corn_131818000.html After the data has been entered, pressing the button “Evaluate” shows the output page that displays the results of assessment. These results can be further explored by “drilling down” the evaluation 3.3 Example 3 tree. In this example we consider products that are made of several SIGMO also provides a print page, which is suitable for saving and ingredients, each of which may have different probability of GMO printing the evaluation results, and a help page describing the presence. For example, one may be tempted to buy pasta made form meaning of input data fields. maize and rice that are produced in Philippines. To assess the probability of presence of GMOs, we use SIGMO to evaluate both The system’s architecture was designed by employing concepts that ingredients separately regarding the possibility of GMOs. As a final will allow seamless scalability. In that manner, the system is result, SIGMO aggregates the outputs of the evaluations into one depends on MongoBD and MySql. The application is Django web recommended answer. For the provided example, SIGMO application that runs on Apache through mod_wgsi. As such the evaluates the possibility of unauthorized genes in the selected application can be easily scaled in order to accommodate products as high. However, SIGMO also considers other attributes. substantial traffic increase. Considering the fact that the product is packed, the possibility of commingling with other maize or rice is none. Therefore, the 3. USAGE OF THE SIGMO DSS system evaluates the risk of GMOs in the product as medium to low. Here we present several examples of how to use the SIGMO DSS in order to answer important GMO related questions about 3.4 Example 4 imported feed and food products. Consider that one wants to find out the GMO presence in complex products consisting from rice and papaya produced in China. The 3.1 Example 1 product is packed, and comes with a documentation from which the In the first example we answer the following question: given the user may see that the product provides Analytical results from a origin of a certain product that consists of only one ingredient (for laboratory. The results show that no Approved or Unapproved example popcorn), can one find all crops that may contain genetic GMOs were identified by the laboratory. Hence a new drop-down modifications? To answer this question we consider a product that menu is shown that asks questions regarding the laboratory results. is made of crops, and that originates from Indonesia. Running The user answers as follows: “yes” for GM presence in country , SIGMO and checking the Crop Species field, we get the list of all “moderate” for Processing Level, “yes” for Appropriate Sampling, crops that are produced in Indonesia and which potentially may “yes” for All Ingredients Included, “yes” for Omnipresent GMOs contain unauthorized gene modifications in EU. The list contains Included, “many” for Number of Screening Elements, “yes” for Maize, Soybean and Sugarcane. Appropriate Data Analysis, “yes” for Validated Methods, “yes” for Accredited Lab. The next question is Is product pre-packed to which the user says “no”. Hence, the user is asked a new set of question about the product transportation. The answers are “few” 3.2 Example 2 for Number of interactions, “no” for Harbour, and “no” for Silo. Another relevant question when importing products from outside the EU is whether the country of origin has large fields that are Given these data, SIGMO provides information on a-priori GM populated with certain GMOs thus providing higher possibility for presence in the product which is “high” for rice and “med” for importing them in EU. For example, using the SIGMO model, one papaya. Still, SIGMO’s final assessment of GMO likelihood is may be interested whether or not Cream Style Corn canned product v_low. The rationale is that the product has been inspected by a (see Error! Reference source not found. ) that contains maize and laboratory that did not find any authorized or unauthorized GMOs, is produced in Thailand, is produced in a region of large GMO and all additional questions regarding the laboratory indicate that production. By selecting the appropriate values for Country and the product has been properly inspected. Although the product has Crop Species, SIGMO provides the result no. not been packaged, it also has neither been stored in silos nor had other interactions with other products at harbors. Thus, the final recommendation from SIGMO is that there is a very low probability that the product contains GMOs. 3.5 Usage of SIGMO So far, SIGMO has been presented twice during the Decathlon project to project participants, traders, producers and importers: once in Lisbon, Portugal in 2015, and once in Shanghai, China in 2016. To measure the impact and usage of the system we have anonymously collected the country of accession of the SIGMO’s web page. A histogram of different accesses to SIGMO in the 7 period of 15.1. – 15.7.2016 is given in Figure 4. From the histogram 5. ACKNOWLEDGMENTS we can see that although the system has been presented only to The research leading to these results has received funding from the participants from involved project countries European Union’s Seventh Framework Programme for research, (http://www.decathlon-project.eu/article/consortium), SIGMO has technological development and demonstration under grant been used worldwide. Additionally, we present the monthly agreement FP7-KBBE-2013-7-613908-Decathlon distributions of web-accesses of SIGMO in the period from (http://www.decathlon-project.eu). December 2015 to September 2016 in Figure 5. The histogram shows increases of web-access from less than 100 in December, 6. REFERENCES 2015 to more than 800 in July, 2016. [1] European Food Safety Authority. 2016. http://www.efsa.europa.eu/. [2] European Commission Fact Sheet. 2015. Questions and Answers on EU's policies on GMOs Press release Database," http://europa.eu/rapid/press-release_MEMO-15- 4778_en.htm. [3] European Commission. 2016. Genetically Modified Organisams. EU register of authorised GMOs. http://ec.europa.eu/food/dyna/gm_register/index_en.cfm. [4] European Commission. 2016. RASFF - Food and Feed Safety Alerts. http://ec.europa.eu/food/safety/rasff/index_en.htm. [5] European Commission. 2003. Regulation (EC) No 1829/2003 of the European parliament and of the council of Figure 4. Different accesses points to SIGMO in the period of 22 September 2003 on genetically modified food and feed. January, 2015 to July, 2016 Official Journal of the European Union, L268, 1 - 23. [6] M. Bohanec, B. M. Boshkoska, T. W. Prins and E. J. Kok. 2016. SIGMO: A decision support System for Identification of genetically modified food or feed products. Food Control, 168–177. [7] M. Bohanec, V. Rajkovič, I. Bratko, B. Zupan and M. Žnidaršič. 2013. DEX methodology: Three decades of qualitative multi-attribute modelling. Informatica, 49 - 54. [8] M. Bohanec. 2015. DEXi: Program for multi-attribute decision making, User’s manual, version 5.00. Ljubljana: Jožef Stefan Institute. IJS Report DP-11897. Ljubljana. [9] European Commission. 2013. Regulation No. 1830/2003 concerning the traceability and labelling of genetically modified organisms and the traceability of food and feed Figure 5. Distribution of SIGMO web-site visits per month in products produced from genetically modified organisms and the period from December, 2015 to August 2016 amending Directive 2001/18/EC. 4. CONCLUSION [10] A. Holst Jensen, 2009. Testing for genetically modified The main goal of SIGMO is to help traders, producers and organisms (GMOs): Past,present and future perspectives. importers to assess the probability of existence of (un)authorized Biotechnology Advances, 27, 71 - 1082. GMOs in the feed and food products. Here we presented four [11] A. J.Arulandhu, J. P. V. Dijk, D. Dobnik, A. Holst-Jensen, J. examples of how to use the system to achieve its goal. In addition Shi and J. Zel. 2016. Critical review: DNA enrichment we presented the number of accesses to the system in a six month approaches to identify unauthorised genetically modified period. Although the system has been presented only in Portugal organisms (GMOs). Analytical and Bioanalytical Chemistry. and in China, it has been accessed from 22 countries. The presented examples can be used in future also for demonstration purposes to students studying for example decision sciences, in particular the DEX methodology. 8 Parallel Draws from the Polya-Gamma Distribution for Faster Bayesian Multinomial and Count Model Inference Rok Češnovar Erik Štrumbelj University of Ljubljana University of Ljubljana Faculty of Computer and Information Science Faculty of Computer and Information Science Večna pot 113 Večna pot 113 Ljubljana, Slovenia Ljubljana, Slovenia rok.cesnovar@fri.uni-lj.si erik.strumbelj@fri.uni-lj.si ABSTRACT logistic regression is a special case of Eq. (1) with all ni = We propose an algorithm for drawing random variates from 1. Typically, Laplace approximation [1, p. 213] or the the two-parameter Polya-Gamma distribution PG(h, z) in Metropolis algorithm [1, p. 541] are used to infer from parallel on GPU, when h is integer. This distribution plays this model. Note that β are the k model coefficients, and an important role in Gibbs-sampling schemes for inference µ0, Σ0 the prior mean vector and prior covariance matrix, from Bayesian categorical and count models. We demon- respectively. strate speedups of the order of 100 on mid-range and of the order of 10 on low-end GPU and how these speedups trans- Recently, Polson et al. [9] proposed a data-augmentation late to similar speedups when the sampler is used in Gibbs scheme that leads to a simple Gibbs sampler for the posterior sampling for binomial regression. of the Bayesian Binomial regression model Eq. (1) w Keywords i|β ∼ PG(ni, βT xi) (2) β|w ∼ N OpenCL, random variables, sampling, Bayesian inference k (µw , Σw ) where µw = Σw(XT κ+Σ−1µ µ 0 0), Σw = (X T W X +Σ−1 0 0)−1, 1. INTRODUCTION κ = y − ni , W is the diagonal matrix of w, and PG(h, z) 2 Bayesian methods play an important role in modern statis- is the Polya-Gamma distribution, which is the focus of this tics and machine learning due to their robustness and con- paper and will be discussed in more detail in Section 2. ceptual simplicity. The main drawback of Bayesian methods is that it is often analytically and computationally difficult This data-augmentation scheme in Eq. (2) is more efficient to infer from the posterior distribution. Instead, we typically than Metropolis samplers and other data-approaches (see resort to structural approximation or, when we are inter- [9] for details) and, which is even more important, easily ex- ested in the full posterior density, computationally-intensive tends to other categorical and count models, such as Multi- sampling-based approximation, in particular, Markov Chain nomial Logistic regression and Negative Binomial regression. Monte Carlo (MCMC). However, the Polya-Gamma distribution is not easy to sam- For example, take the binomial regression setting with n ple from and can be a performance bottleneck, in particular samples and k input variables. Let yi ∈ {0, 1} be the de- when h is large. In this paper we propose an algorithm for pendent variable, xi ∈ Rk the input variables, and ni the sampling from PG(h, z) in parallel on GPU, when h is in- number of attempts for the i−th sample, i = 1..n. Even for teger, which is sufficiently general for categorical and count a standard and relatively simple model, such as the Bayesian models. We demonstrate its utility as a stand-alone random binomial regression model variable generator and when used as part of a Gibbs sampler for Bayesian inference. 1 yi ∼ Binomial ni, 1 + e−βTxi (1) 2. PG DISTRIBUTION AND SAMPLING β ∼ Nk(µ0, Σ0) In this section we briefly review the properties of the PG neither the joint posterior nor the full conditional posterior distribution relevant to our work. The Polya-Gamma distri- distributions can be derived analytically. Note that Bayesian bution PG(h, z) is a two-parameter continuous distribution (h > 0, z ∈ R). The first important property of the PG distribution is that if X ∼ PG(hx, z) and Y ∼ PG(hy, z) then X + Y is distributed PG(hx + hy, z). Therefore, for integer h, a random variate from PG(h, z) can be generated using h random variates from PG(1, z). The above property allows us to focus on sampling from PG(1, z). Our parallel algorithm is based on the algorithm from [9], which is based on sampling from the (exponentially tilted) Jacobi distribution J∗(1,z) [4] and the relationship 9 that if X ∼ J∗(1,z/2) then Y = X/4 ∼ PG(1, z). Therefore, parallelizing for the GPU, we are dealing with the trade-off our sampling problem reduces to sampling from J∗(1,z). between the demands for a very large number of threads and the demand that the threads have a large enough workload The density f (x) of a J∗(1,z) distributed random variable so that the cost of creating threads does not outweigh the contains an infinite sum and can only be approximated (the benefits of parallel execution. We also want to have simi- same holds for the density of PG(1,z), of course), which lar workloads in the threads, because uneven workloads are makes sampling difficult. However, it has an alternating huge performance drawbacks in GPU programming. series representation The naive approach would be to split the problem, so that ∞ X z2x f (x) = (−1)icosh(z) exp − a each sampling i from PG(hi, zi) is a task. This way each 2 i(x), i=0 task is independent, therefore requiring no communication. However, we could have the problem of uneven workloads, since threads that would perform draws with large hi would  n o have a substantially higher workload than threads with small π(i + 1/2)( 2 )3/2 exp − 2(i+1/2)2 0 < x < t a πx x h i(x) = i. Furthermore, with small R, the low number of threads n o π(i + 1/2) exp − 2(i+1/2)2π2 x x > t, would not fully utilize GPUs performance. 2 where the optimal choice of t = 0.64 [4]. To deal with these drawbacks we define a task to be a single draw from PG(1, zi). The final draws PG(hi, zi) are calcu- The partial sums of the series Si(x) = Pi (−1)ja j=0 j (x) sat- lated on the CPU by summing hi samples from PG(1, zi), isfy as explained in Section 2. This way tasks are more ho- S mogeneous and the workload of threads more even, as the 0(x) > S2(x) > · · · > f (x) > · · · S3(x) > S1(x), computational complexity of PG(1, zi) is practically inde- which facilitates a modified rejection algorithm for sampling pendent of zi. The number of tasks in this parallelization is from f (x). The problem that f (x) can not be evaluated is Rtask = Σli=1rihi. Note that Rtask > R if there are hi > 1, circumvented by iteratively calculating the partial sums and so we are able to better utilize the GPU when R is small. stopping early using the above property of Si(x). Given an upper hull (envelope) function g(x) > f (x), we can sample The homogeneity of tasks also gives us the ability to easily from f (x) by sampling from g(x) and accepting each sam- group them in larger but still homogeneous workloads. Each ples x0 with probability f (x0)/q(x0), which can be done by thread thus performs batches of B draws from PG(1, zi) sampling U ∼ U (0, g(x0)) and checking U < f (x0). Due to instead of single draws. This ability is useful when searching the property of the partial sums, we know that all odd terms for the optimal point in the aforementioned trade-off. For are less than f (x) - therefore, if U < Si(x0) for odd i, we example, when Rtask is much greater than the number of can deduce U < f (x) and accept the sample. Similarly, if execution units C on the GPU. If each thread would execute U > Si(x0) for even i, we can deduce U > f (x) and reject a single draw, the benefits of parallelization would be small the sample. as Rtask-C threads would be assigned to the queue. The cost of creating tasks would therefore outweigh these benefits. In The first partial sum S0(x) is a natural (and efficient [9]) cases when C > Rtask, we could define B = 1, thus utilizing choice for the upper hull function g(x) and it can be sampled all the execution units. With the ability to easily fine-tune from a mixture of a truncated inverse-Gaussian distribution B we can get the best performance from GPUs of different and a truncated exponential distribution vendors and architectures. Note that parallelization within ( the execution of single draws from P G(1, z IG(|z|−1, 1)I i) is not feasible X ∼ (0,t] with prob. p see [9] as these parallel tasks would require a lot of communication, Exp(−z2/2 + π2/8)I(t,∞] with prob.1 − p. outweighing the benefits of parallel execution. Left-truncated exponential variates can be generated triv- ially by translating and scaling a Exp(1) variate. Inverse- 3.1 Pseudo-Random number generation and Gaussian variates are sampled using the method of [5], which sampling from probability distributions is based on Exp(1) and standard normal N(0, 1) variates. For further details on the above algorithm for sampling from The algorithm for drawing from PG(1, z) requires random J∗(1,z) and therefore PG(1, z) see [9]. Note that other, faster variates from distributions U (0, 1), N (0, 1), and Exp(1). We approximate approaches for sampling from PG have been implemented the XORShift128 [7] uniform random number developed [11], but are unstable for larger h. generator, as this generator is one of the fastest on the GPU and has satisfactory statistical properties [3, 6]. Unit uni- 3. PARALLEL SAMPLING ON GPU form variates U (0, 1) are generated trivially, by dividing the generated number with the maximum possible uniform ran- The basic case of sampling from the distribution is draw- dom number. Standard normal variates N (0, 1) are gener- ing r samples from PG(h, z). In this paper we focus on a ated using U (0, 1) variates and the Box-Mueller transform more generalized problem of performing l simultaneous sam- [2]. Exponential variates Exp(1) are generated using the plings, each with different ri, hi, and zi. The total number U (0, 1) variates and the inverse transform method. of samples to be drawn is R = Σli=1ri. When parallelizing, the first step is to split the problem into 3.2 Data transfers smaller tasks that require minimal amount of communica- The input data required for sampling from P G(1, z) are: tion. Each task is then assigned to a single thread. When the array of all Rtask z values and the seeds for the random 10 number generator in each thread. Values z are floating-point 0.1 0.5 numbers while random seeds consist of 4 unsigned integer 2 values. The output data of the calculation on the GPU are the Rtask real-valued samples from P G(1, z). The input data l m 1 size is therefore F × R Rtask task + 4 × 4 × bytes, while the B output data size is F × Rtask bytes, where F is 8 for double 0 precision or 4 for single precision. If the input and output data size is larger than the size of global memory, execution is split in smaller steps which has a negative effect on overall −1 speedup. For a mid-range notebook GPU with 4GB of global Config memory, the execution has to be split for R II task > 227 in the speedup) 1 2 ( worst case of B = 1 and F = 8. I 10 2 log 3.3 Thread organization 1 Both modern and older GPU consist of a large number of small and simple execution units, grouped together into a smaller number of groups. These groups of units are termed 0 Streaming Multiprocessors (SM) in Nvidia and Computa- tion Units (CU) in AMD/ATI graphics processing units. Ex- −1 ecution units in a single CU/SM share more advanced logic for scheduling, caches, registers, etc. How threads get as- 2 4 6 2 4 6 log signed to these executions units, SM or CU depends on the 10( h ) thread organization. GPU threads are organized in work- groups or blocks of the same size. Each block is assigned to Figure 1: A plot of speedups (BayesLogit algorithm a single SM/CU and the threads inside the block are exe- time divided by GPU algorithm time) against size of cuted in the SM/CU execution units. h. A log-log scale is used to simplify interpretation. Dashed line indicate for which n both algorithms Threads in the same block therefore share the registers and take the same amount of time. caches. Local values for threads are stored in registers. If all the registers are used, the local data for non-active threads is cached or even stored in global memory, depending on the ment is an indirect comparison of the sampling algorithms architecture and the amount of data used. If we want to through the use in Bayesian inference. achieve optimal performance we want to avoid global mem- ory access. Beside memory access, parallelization on the ex- In the following two subsections, we describe the experi- ecution unit level and SM/CU level is the other key factor ments and report results on two different hardware con- when determining thread block size. If we organize threads figurations I (CPU: Intel Core i7-4790 @ 3.60GHz, GPU: in a large number of small blocks we can underutilize the ex- AMD Radeon HD 7970, RAM: 8GB DDR3 @ 1333 MHz) ecution units in a single SM/CU. On the other hand if we or- and II (CPU: Intel Core i7-5600U @ 2.60GHz, GPU: In- ganize threads in fewer large blocks, some SM/CU could be tel HD Graphics 5500, RAM: 12 GB DDR-3 @ 1333MHz). underutilized while threads in other SM/CU would queue. Note that thread-block (64) and batch (20) sizes were tuned for best (on average) performance on hardware configura- In our case, we can not define a constant thread-block size tion I and used for both hardware configurations. All mea- that would be optimal or even near-optimal for all possi- surements were made using the microbenchmark package [8], ble values Rtask, B and all architectures. For the exper- taking the median value of 1000 repetitions. iments done in Section 4, we fine tuned the thread-block size together with the batch size B empirically. We reduced 4.1 Generating PG random variates the search space for these two parameters based on detailed knowledge of the architectures used. We measured the running times of generating a single PG random variate (r = 1) for increasing h = 2, 4, 8, . . . across different z = {0.1, 0.5, 1.0, 2.0}. The number of samples r is 3.4 Implementation set to 1. Varying both r and h is redundant, as increasing We implemented the algorithm as part of a R package. Input either one increases the key performance variable Rtask. data checking and output data allocation was implemented in R, while the rest of the algorithm was implemented in Results are summarized in Figure 1 and show that for large- C++ and OpenCL 1.2. OpenCL was used over CUDA for enough h speedups of over 100 and approximately 30 can the purpose of being GPU-vendor agnostic. OpenCL 1.2 was be achieved on configurations I and II, respectively. Results used over 2.x in order to be compatible with more GPUs. also show that the value of z does not have a substantial influence on running times. 4. EMPIRICAL VALIDATION We performed two experiments. The first experiment is a di- For small h, running times of the GPU algorithm can be rect comparison of our parallel GPU algorithm and existing up to 10 times slower than the BayesLogit implementation CPU implementations for sampling from the PG distribu- and Rtask of over 10000 is required to achieve at least some tion from the BayesLogit package [9]. The second experi- speedup. This is due to the overhead of building and trans- 11 expected on high-end specialized GPU. Furthermore, the Table 1: Estimated running times and speed-ups measurements in our paper regarding the size of h required for a single iteration of Gibbs sampling on the to achieve speedup and maximum speedup are pessimistic, NBAplayers data set. Results are for each hardware because they include, for each sampling, the overhead for configuration (I and II) separately and broken down transferring data and code to the GPU, which would only into three parts: sampling from Polya-Gamma, sam- have to be done once. pling from multivariate normal, and matrix oper- ations. Standard deviations are included, because Additional speedup could be achieved by parallelizing the times vary from iteration to iteration. matrix operations and generating multivariate random vari- Part BayesLogit [µs] GPU [µs] Speedup ates, for which efficient GPU implementations exist. Com- PG 230423 ± 4745 5217 ± 120 44 ± 1 bining these into fully GPU-parallelized multinomial and N I k 50 ± 2 50 ± 1 - count models is delegated to future work. Another direc- matrix 3212 ± 133 3107 ± 83 - tion for future work is automatically tuning the two paral- total 233685 ± 4790 8435 ± 160 27 ± 0 lelization parameters (thread-block size, batch size) to the PG 127482 ± 2525 6793 ± 112 18 ± 0 architecture of the GPU and Rtask. This can be done either N II k 53 ± 1 54 ± 1 - offline, learning from performance on known configurations, matrix 3969 ± 9 3996 ± 42 - or online, gradually adapting the parameters based on per- total 131504 ± 2522 10844 ± 101 12 ± 0 formance, which is particularly suited to Gibbs sampling, as several iterations are made with the same values of hi. ferring the OpenCL kernel and the transfers of input and 6. ACKNOWLEDGMENTS output data. The cost of the former is larger but is also This work was supported by the Slovenian Research Agency one-time and thus negligible for large h. (ARRS Applied Project L1-7542). 4.2 Binomial regression 7. REFERENCES We illustrate the benefits of GPU-accelerated draws from [1] C. M. Bishop. Pattern Recognition and Machine PG to Bayesian inference. We use the binomial regression Learning. Springer-Verlag New York, 2006. model from Eq. (2) to model the relationship between bas- ketball players’ features and their career three-point shoot- [2] G. E. P. Box and M. E. Muller. A note on the ing percentages. The model was implemented in R, using generation of random normal deviates. Ann. Math. core R matrix operations and Statist. , 29(2):610–611, 06 1958. MASS package for generating multivariate normal variates [10]. Zero µ [3] V. Demchik and N. Kolomoyets. QCDGPU: 0 and unit diagonal Σ Open-Source Package for Multi-GPU Monte Carlo 0 were used as prior values. Lattice Simulations. Computer Science, 1(1):13–21, The data set contains 744 NBA basketball players and, for 2014. each player, his total three points attempted and made, [4] L. Devroye. On exact simulation algorithms for some weight, height, minutes per game, and indicator variables distributions related to Jacobi theta functions. whether the player was a center or a forward. Therefore, a Statistics & Probability Letters, 79(21):2251–2259, total of 744 samples and 6 input features, including the con- 2009. stant term (intercept). The average number of three-point [5] D. Luc. Non-uniform random variate generation. NY: shots attempted (n Springer, 1986. i in the model, hi in sampling) is 322.6 (ranging from 1 to 4270), leading to Rtask = 239988. [6] M. Manssen, M. Weigel, and A. K. Hartmann. Random number generators for massively parallel Results are summarized in Table 1. Between-iteration vari- simulations on GPU. The European Physical Journal ability is relatively small, which is consistent with values Special Topics, 210(1):53–71, 2012. of z not having a strong influence on running times (the [7] G. Marsaglia. Xorshift RNGs. Journal of Statistical hi = ni are fixed for all iterations). The speedup for draws Software, 8(1):1–6, 2003. from PG (44 and 18 for configurations I and II, respectively) [8] O. Mersmann. microbenchmark: Accurate Timing is also consistent with results from the first experiment for Functions, 2015. R package version 1.4-2.1. Rtask = 239988. The example also illustrates how most of [9] N. G. Polson, J. G. Scott, and J. Windle. Bayesian the speedup in PG draws translates to total speedup, be- inference for logistic models using Pólya–Gamma cause the PG part represents the majority of the total time. latent variables. Journal of the American statistical This is true when n and number of input features k are Association, 108(504):1339–1349, 2013. small relative to Rtask, which is common in the practice of [10] W. N. Venables and B. D. Ripley. Modern Applied binomial, multinomial, and count regressions. Statistics with S. Springer, New York, fourth edition, 2002. ISBN 0-387-95457-0. 5. CONCLUSIONS [11] J. Windle, N. G. Polson, and J. G. Scott. Sampling In this paper we have parallelized the process of generat- polya-gamma random variates: alternate and ing Polya-Gamma random variates on GPU. Speedup were approximate techniques. arXiv preprint of the order of 100 on a mid-range non-specialized GPU, arXiv:1405.0506, 2014. which already makes it very useful both as a standalone sam- pler and as part of a Gibbs sampling scheme for Bayesian multinomial and count models. Higher speedups can be 12 Monitoring and Management of Physical, Mental and Environmental Stress at Work Božidara Cvetković, Martin Michał Kosiedowski Mitja Luštek Gjoreski, Martin Frešer Poznań Supercomputing and Jožef Stefan Institute Jožef Stefan Institute Networking Center Department of Intelligent Systems Department of Intelligent Systems ul. Jana Pawla II 10, Poznań, Poland Jamova cesta 39, Ljubljana, Slovenia Jamova cesta 39, Ljubljana, Slovenia kat@man.poznan.pl mitja.lustrek@ijs.si boza.cvetkovic@ijs.si ABSTRACT health and comfort problems, and is most commonly linked with In this paper, we present the design of the Fit4Work system for air quality. It has been shown that in addition to health related monitoring and management of physical, mental and problems, the bad environmental conditions can cause decreased environmental stress at work. In addition, preliminary results of productivity. Inappropriate temperature decreases the productivity the monitoring modules are presented. The goal was to design a by 10%, humidity by 5% and air pollutants by 6-9%, which can low-cost system that can use commercial sensors to monitor result in a higher level of mental stress due to productivity several aspects of the users’ lifestyle and to provide pressure [1][2][3]. The objective of the Fit4Work system [4] is to recommendations for improving it. The system is designed for detect sedentary, stressful and unhealthy environment and provide older workers who are subject to sedentary stressful work in an information in the form of recommendation to the user. office environment. The preliminary results show that the system The smartphone industry is constantly developing and with each can sufficiently detect sedentary, stressful and unhealthy new product the difference between the computing power of a environment and provide information in the form of computer and a smartphone is being blurred. Today’s average recommendation to the user. smartphone is equipped with sensors for motion monitoring (e.g., accelerometers, gyroscopes, GPS, etc.), ambiance monitoring Categories and Subject Descriptors (e.g., microphones, light, etc.) and with each next generation this D.3.3 [Human-centered computing]: Ubiquitous and mobile list is extended. If we take into account that by the end of 2017, computing – ambient intelligence, mobile computing, ubiquitous over a third of the world population is projected to own a computing smartphone (2.6 billion users) it makes it the most appropriate device for mobile sensing and computing. Furthermore, with increased availability of commercial sensors and even dedicated Keywords devices that can connect with a smartphone and sense Physical activity monitoring, mental stress detection, environment environmental parameters or physiological parameters of a quality management, wearable sensors, ambient sensors person, the possibility of applications has become almost infinite. Monitoring of persons’ physical activity is not new and is in fact 1. INTRODUCTION very popular in terms of number of smartphone applications, The pace of life has been increasing with each day in the dedicated devices and even smartwatch applications already developed world, resulting in a lot of health risk factors. Most of available on the market. Smartphone only applications for activity the active population spends large amount of time at work in monitoring use smartphone accelerometer to estimate the burned closed environments under productivity pressure, which calories and amount of movement based on the number of steps contributes to different aspects of stress, such as unhealthy the user takes over a day. Dedicated devices such as Microsoft amount of sedentary activity, high mental stress and in many cases Band 2 [6] use simplified activity recognition and machine- low quality of the environment. If such lifestyle is maintained learning to improve the estimated calorie burn. Apple Watch [7] over longer period of time and even becomes a regular lifestyle of goes one step further and contains hourly reminders if the person a person it can severely contribute to development of other is sedentary and provides feedback and motivation to reach the behavioral changes and chronical or even deadly diseases. daily goal. However, if the user wants to be monitored accurately Physical stress or physical inactivity is the fourth leading cause of it is required to input the current activity which is being death worldwide. It contributes to increased mental stress, performed. The applications and devices available on the market development of cardiovascular diseases, diabetes, obesity and are satisfactory for most of the active users who want to track their other unhealthy behavioral changes. Exposure to mental stress at activities such as running or just to get an insight into their the workplace is not necessarily a negative thing since right lifestyle. If the application/device also provides a feedback and a amount of stress is needed for better productivity, however, motivation the price gets too high, thus being too expensive for an prolonged exposition to stress can result in chronical stress which average person. is a trigger for slower body recovery, other mental illnesses and Monitoring mental stress using commercial and unobtrusive vulnerability to infections due to decreased immune system. devices is relatively new and challenging research topic. Healey Environmental stress is related to overstaying in the environment and Picard [8] were first to show that the stress can be detected with low quality conditions. The term Sick Building Syndrome using physiological sensors which required intrusive wires and (SBS) is used for buildings that cause the occupants various electrodes. Hovsepian et al. [9] proposed cStress which 13 continuously monitors stress level using an ECG sensor, and were utilizes data from the commercial weather station NetAtmo to looking forward to use smartwatches in the future. With the detect whether the quality of the environment has decreased. Each advancement of the technological devices equipped with of the data analysis components has its own recommendation physiological sensors, such as Empatica [10] and Microsoft Band component which utilizes results from the data analysis to 2 [6], these obtrusive methods could finally be implemented for generate recommendation. The generated recommendations are everyday use. Various studies exist, where researchers combined sent back to the users’ smartphone. signal processing and machine-learning to implement stress detection. The studies were utilizing different sources of signals, obtrusive sensors such as camera, microphone or unobtrusive wearables [11] such as wrist device which is acceptable for most of the people. The monitoring and control of indoor environment parameters is a popular topic in smart–building research where they mainly focus on the trade-off between the occupants’ comfort in terms of environment quality, and energy consumption [12]. In this research it is prerequisite to have an automated building and fully equipped rooms with HVAC systems which can be a substantial investment and mainly indicates that most of the people living in Figure 1. Fit4Work system for monitoring and management of older buildings will not be able to use such system. To develop a physical, mental and environmental stress overview. system which does not need very expensive equipment one must develop several virtual sensors for detection/estimation of parameters that cannot be sensed directly and develop prediction 2.1 Physical Activities models for environmental parameters. A correlation between the The goal of the physical activity module is to recognize current environmental values and the occupancy state or window state has state of the user is terms of activity and expended energy. been reported in previous research [13] which proves that it can Additionally, the output of the physical activity module is be modeled. Estimation of the number of occupants was aggregated and used for generating recommendations which will successfully achieved with using the data of single or multiple guide the user to successfully achieve daily and weekly activity environmental parameters [14]. We did not come across any goals. relevant methods for window state detection. The prediction of the dynamics of the parameters is usually done with the predictive control technique [15] which requires the user to describe the environment in detail (e.g., material of walls, windows, etc.) which is not very user friendly. The Fit4Work system will utilize sensors from an accelerometer equipped smartphone, Microsoft Band 2 and a commercial weather station NetAtmo [16] to monitor and detect increased physical, mental and environmental stress at workplace and provide recommendations through the dedicated smartphone application which will assist in management of the stresses. To be able to successfully achieve this goal we designed a system which utilizes multiple machine-learning models to monitor physical activity, mental stress of the user and predict the state of the workplace and predict future dynamics of the environmental Figure 2. Workflow of the physical activity monitoring data parameters. analysis component The paper presents the design of the final Fit4Work system for The physical activities method is composed of six tasks and is monitoring and management of physical, mental and presented in Figure 2. The devices we use are the accelerometer environmental stress, and refers to preliminary results of the equipped smartphone and the Microsoft Band 2 wristband capable modules which were obtained independently of other modules. of measuring acceleration and physiological signals such as heart rate, galvanic skin response, skin temperature and R-R interval. 2. FIT4WORK SYSTEM OVERWIEW First, we check whether each of the two devices is present on the The Fit4Work system for monitoring and management of user’s body. Second, if the smartphone or wristband is on the physical, mental and environmental stress is composed of three body, we wait for a walking period. Walking is detected with a data analysis and recommendation components and is presented in machine-learning model in a location- and orientation- Figure 1. The physical activity monitoring utilizes data from the independent manner. The walking signal is a prerequisite for smartphone accelerometer and Microsoft Band 2 to recognize the normalizing the orientation of the smartphone and the wristband current activity of the user and to estimate the energy expenditure and for recognizing the location of the smartphone. Third, when a while performing the recognized activity. The mental stress 10-second period of walking is detected, we normalize the monitoring utilizes data from the Microsoft Band 2 and physical orientation and detect the location of the smartphone utilising activity as recognized by the physical activities module to detect location machine-learning model. If the smartphone ceases to be the level of mental stress. The quality of the environment module on the body because the user has taken it out of the pocket or bag, the information on the orientation and location is no longer valid 14 and we have to wait until the next period of walking. Fourth, distinguish between true stress and the many situations which depending on the devices which are on the body and the location induce a similar physiological arousal (e.g., exercise, eating, hot in which the smartphone is, we invoke the appropriate weather, etc.). By introducing the context-based classifier we can classification model for activity recognition (AR) and regression provide more information about the real-life circumstances and model for estimating user's energy expenditure (EEE). For more the user, thus to improve the detection performance. information the reader is referred to [17]. The preliminary experiment of the module was done using the In this paper we present preliminary results for AR (lying, Empatica wristband and the results are presented in Table 1. For walking, running, standing, sitting, cycling, mixed) and EEE with more details the reader is referred to [21]. the wristband only if the wristband is present and smartphone Table 2. Confusion matrix for leave one user out evaluation of only if wristband is not present. The results are presented in Table the mental stress detection module. 1. The experiments were performed on the data obtained with Empatica in scenario presented in [17]. The results of AR is No Stress Stress presented in terms of accuracy and the EEE in terms of mean No Stress 790 23 absolute error (MAE). The results of the EEE are compared Stress 51 63 against the commercial device SenseWear [19], one of the most accurate consumer devices for EEE. Future work includes investigating the relation between labelled stress level, recognized stress level and cortisol levels and Table 1. Preliminary results of the activity recognition and personalization of the models. estimation of energy expenditure. 2.3 Quality of the Environment Smartphone Monitoring of the quality of the environment detects worsening of Wristband Trousers Torso Bags SenseWear measured parameters and recommends appropriate actions which AR [%] 88.4 87.9 78.9 79.5 / will return to or keep the parameters in optimal range. The module EEE [MET] 0.87 0.87 0.92 1.15 1.0 for monitoring and management of the quality of the environment Future work includes use of both devices if both are present as is composed of three components - a sensing component, an done in our previous work [17]. ontology with a reasoner, and a simulator, as presented in Figure 4. 2.2 Mental Stress The sensing component is composed of hardware sensors and The stress monitoring module continuously monitors user’s stress virtual sensors. The hardware sensors are the real sensors which and provides overview of stressful events to the user. measure the environmental parameters such as temperature, Additionally, the output of the stress-monitoring module is used humidity, CO2, noise, etc., while virtual sensors use machine – by a stress-relief module for suggesting appropriate exercises at learning models on the raw parameter values to estimate room- appropriate time. specific properties that cannot be sensed directly such as For the stress-monitoring module we developed a machine estimating the number of occupants and detecting the state of the learning method which is applied on data collected via sensor- devices. The outputs of the sensing component are fed into the equipped device worn on the wrist, Microsoft Band 2. The ontology, from which the reasoner infers actions that can improve method for detection of mental stress is presented in Figure 3 and the state of the environment, based on the current state and consists of three machine-learning components (a base stress present devices. The list of actions returned by the ontology is fed detector, an activity recognizer and a context-base stress detector). into the simulator. The simulator is composed of prediction models and the quality rating module (Q–rating) which predicts the environmental values for all combination of actions and evaluates the resulting parameter states (values range between -1 and 1). The action resulting in the best state is finally recommended by the system. The reader is referred to [22] for more details. Figure 3. Method for mental stress detection. The laboratory stress detector is a machine-learning classifier trained on laboratory data to distinguish stressful vs. non-stressful events in 4-minute data windows with a 2-minute overlap. As input it uses features computed from the physiological signals (blood volume pulse, heart rate, R-R intervals, skin temperature and electrodermal activity), and it outputs a prediction for Figure 4. Method for monitoring and management of the possible stress event. In addition, the activity is retrieved from the quality of the environment. physical activity monitoring module and provides a context The preliminary results on the winter data are shown in Table 3. information for the context-based stress detector. The context- First two rows present the results of the virtual sensors, one for based stress detector aggregates the predictions of the laboratory detecting the window state and the second for estimating number stress detector, uses context information and provides a prediction of occupants, last three rows present the results of the prediction every 20 minutes. The interval of 20 minutes was chosen models. The result for the classification is presented in terms of empirically. We decided to use the context-based classifier to 15 accuracy and results for the regression in terms of mean absolute [5] Smartphone statistics, error (MAE) and root mean squared error (RMSE). https://www.statista.com/topics/840/smartphones/ The future work includes development of the virtual sensors for [6] Microsoft Band 2, https://www.microsoft.com/ other devices, such as humidifier, air conditioning etc. and [7] Apple Watch, http://www.apple.com/watch/ improvement of the prediction models by using additionally collected data. [8] Healey J. A., Picard R. W. 2005. Detecting Stress During Real-World Driving Tasks Using Physiological Sensors. Table 3. Results of the machine-learning models used for IEEE Trans. Intell. Transp. Syst., vol. 6, no. 2, pp. 156–166. virtual sensors (window state and number of occupants) and for prediction of environmental parameters. [9] Hovsepian, K., Absi, M., Kamarck, T., Nakajima, M. 2015. cStress : Towards a Gold Standard for Continuous Stress ACC MAE RMSE Assessment in the Mobile Environment. In the Proceedings E V E V E V of the 2015 ACM UbiComp, pp. 493-504, 2015. Window state [%] 91 81 ✗ ✗ ✗ ✗ [10] Empatica, https://www.empatica.com/ No. of occupants ✗ ✗ 0.6 0.5 1.2 0.8 Predict T [℃] ✗ ✗ 0.4 0.5 0.5 0.6 [11] Muaremi, A., Bexheti, A., Gravenhorst, F., Arnrich, B., Predict H [%] ✗ ✗ 0.6 0.3 0.9 0.5 Tröster, G. 2014. Monitoring the Impact of Stress on the Predict CO Sleep Patterns of Pilgrims using Wearable Sensors. IEEE- 2 [ppm] ✗ ✗ 55 45 104 61 EMBS Int. Conf. Biomed. Heal. Informatics, pp. 3–6. 3. CONCLUSION [12] Kumar, P., Martani, C., Morawska, L., Norford, L., We present a work-in-progress system for monitoring and Choudhary, R., Bell, M., Leach, M. 2016. Indoor air quality management of physical, mental and environmental stress which and energy management through real-time sensing in utilizes machine-learning models on the data from commercial commercial buildings. Energy and Buildings 111, 145–153. devices (accelerometer equipped smartphone, Microsoft Band 2 [13] Konuah, A. B. 2014. Occupant window opening behaviour: and NetAtmo weather station) to detect the state of the user and the relative importance of temperature and carbon dioxide in recommend appropriate actions to improve users physical, mental university office buildings. Ph.D. Dissertation. University of and environmental state. Sheffield. The goal of the Fit4Work project is to develop a low cost system [14] Candanedo, L. M., Feldheim, V. 2016. Accurate occupancy which can be used in any environment (without the need to detection of an office room from light, temperature, humidity renovate or automate the environment). The preliminary results of and CO 2 measurements using statistical learning models. each module show that selected devices in combination with Energy and Buildings 112, 28–39. intelligent algorithms can sufficiently detect sedentary, stressful [15] Kolokotsa, D., Pouliezos, A., Stavrakakis, G., Lazos. C. and unhealthy environment and provide information in the form 2009. Predictive control techniques for energy and indoor of recommendations to the user. environmental quality management in buildings. Building In the future, we plan to integrate the three modules into a single and Environment 44, 9, 1850–1863. system in form of a smartphone application and develop first [16] NetAtmo, https://www.netatmo.com/ recommendation prototypes. We will upgrade the machine- [17] Cvetković, B., Milić R., and Luštrek, M. 2016. Estimating learning models for recognition of the activities and estimation of Energy Expenditure With Multiple Models Using Different energy expenditure to utilize both devices (the smartphone and Wearable Sensors, in IEEE Journal of Biomedical and Health the wristband) as seen in our previous work, personalize the stress Informatics, vol. 20, no. 4, pp. 1081-1087. detection method and upgrade its machine-learning models to utilize other contextual information and enhance the environment [18] Cvetković, B., Janko, V., Luštrek, M. 2015. Demo abstract: quality monitoring with additional machine-learning models. Activity recognition and human energy expenditure estimation with a smartphone, PerCom 2015, pp. 193-195. 4. ACKNOWLEDGMENTS [19] SenseWear, https://templehealthcare.wordpress.com/the- This research was conducted within the Fit4work project co- sensewear-armband/ financed through the AAL Programme. [20] Palshikar, G.K. 2009. Simple Algorithms for Peak Detection in Time-Series”. Ii the Proc. 1st Int. Conf. Advanced Data 5. REFERENCES Analysis, Business Analytics and Intelligence, June. [1] Kohl, H. W., Craig, C. L., Lambert, E. V., Inoue, S., [21] Gjoreski, M., Gjoreski, H., Luštrek, M., Gams. M. 2016. Alkandari, J. R., Leetongin, G., Kahlmeier, S. 2012. The Continuous stress detection using a wrist device: in pandemic of physical inactivity: global action for public laboratory and real life. In Proceedings of the 2016 ACM health. The Lancet, vol. 380, no. 9838, pp. 294–305, 2012. UbiComp '16 [2] WHO Mental Stress, [22] Frešer, M., Cvetković, B., Gradišek, A., and Luštrek, M. http://www.who.int/occupational_health/topics/stressatwp/ 2016. Anticipatory system for T--H--C dynamics in room with real and virtual sensors. In Proceedings of the 2016 [3] Lan, L., Wargocki, P., Lian. Z. 2012. Optimal thermal ACM UbiComp '16. environment improves performance of office work. REHVA European HVAC Journal (January 2012), 12–17 [4] Fit4Work, http://www.fit4work-aal.eu/ 16 Bayesian binary and ordinal regression with structured uncertainty in the inputs Aleksandar Dimitriev Erik Štrumbelj University of Ljubljana University of Ljubljana Faculty of Computer and Information Science Faculty of Computer and Information Science Večna pot 113, Ljubljana, Slovenia Večna pot 113, Ljubljana, Slovenia ad7414@student.uni-lj.si erik.strumbelj@fri.uni-lj.si ABSTRACT that forecasting accuracy can be improved by first modelling We propose a novel approach to binary and ordinal predic- the uncertainty in the inputs and second taking this uncer- tion with structured uncertainty in the input variables. It tainty into account in the model. In this paper we focus is based on efficiently approximating the prediction model on developing an approach for the the second task, while conditional on the inputs and then marginalizing the con- addressing the first in a limited case. ditional model over the input space using Monte Carlo ap- proximation. For efficiency, the well-known Laplace approx- In its essence, the problem is similar to modelling with mea- imation is used for the binary case and we derive a similar surement error, where input variables X (in our case, aggre- approximation for the ordinal case. Empirical evaluation on gated counts) are measured with error. This has received sports data shows that the proposed approach substantially a lot of attention [2, 3]. Measurement error models can improves forecasting accuracy and highlights the severity of roughly be split into two groups: functional or structural the problem of uncertainty in the input variables in sports. modelling [2]. Functional models make no or very few as- sumptions about the distribution of X, while in structured Keywords models, a model is placed on X, typically a parametric one. Laplace approximation, sampling, sports, inference The difference between our case and a typical measurement error settings is that sports teams recur, so training samples 1. INTRODUCTION are not independent or identically distributed, but we have This work is motivated by a problem from sports forecasting multiple measurements of the same input variable. This al- [9] - predicting future outcomes from past performances. To lows us to use a structural approach as described in section illustrate the problem, imagine we are interested in predict- 3.1. However, we also need a model that can handle uncer- ing the outcome of a basketball game between two teams. tain inputs and this is the focus of the paper. In Section 2, Typically, we would first compile a set of past games with we propose a Bayesian approach to model the relationship outcomes and relevant performance-related variables, which between the noisy inputs and the outputs and incorporate are typically count variables, such as shots made/missed, re- uncertainty associated with the inputs in the form of prior bounds, etc... Then we aggregate these variables across past information. The empirical evaluation, data, and results are performances and possibly transform them into variables described in Section 3. With Section 4 we conclude the pa- that are known to be good descriptors of team quality and per and offer directions for further work. predictors of future performance, for example, shooting per- centage (total shots made divided by total shots attempted). 2. METHODS Finally, we would use some binary regression model to model The standard prediction setting is the following: we are the relationship between the inputs and the outcomes. given a dataset X ∈ Rn×m, which consists of n observa- tions and m input variables, and a target variable y ∈ Rn. What has so far not been taken into account in related work We then train a model that can produce predictions y∗ for is that past performance variables are noisy and that the a given new x∗ by using the training data X. We adopt noise is transferred to any input variables we derive from the Bayesian approach p(β|X, y) ∝ p(X, y|β)p(β), where β them. Fitting a model and failing to account for this uncer- are the parameters of the model. For the prediction setting, tainty will result in being overconfident in the parameters of we obtain the probability by marginalizing over the model the model and subsequently the forecasts. We hypothesize parameters: Z p(y∗|x∗, X, y) = p(y∗|β, x∗)p(β|X, y) dβ. In the above, the data are treated as constant. A more gen- eral problem arises in our setting, where we instead treat the inputs X and x∗ as random variables, with densities p(X|w) and p(x∗|w∗), where w and w∗ are known constants. In this case, to train a model and obtain its posterior distribution p(β|w, y) we must marginalize over the input space of pos- 17 sible training data sets X: for j ∈ {1, .., k − 1}, where β and αj are parameters. For Z Z convenience, let α0 = −∞ and αk = +∞. The outcome p(β|w, y) = p(β, X|w, y) dX = p(β|X, y)p(X|w) dX. probabilities can then be written as Similarly, to obtain a prediction for y∗ we must not only P (Y = j|x) = P (Y ≤ j|x) − P (Y ≤ j −1|x) marginalize over the parameters β, but also over the distri- = σ(βx+αj)−σ(βx+αj−1), for j ∈ {1, .., k}, bution of the test sample x∗ as follows: where σ is again the inverse logit function. Z Z p(y∗|w,w∗,y) = p(y∗|x∗, β)p(x∗|w∗)p(β|w, y) dx∗ dβ x The proportional odds model and its generalizations (see [7]) ∗ β (1) have received very little attention in the Bayesian setting. This approach is general, since it can be applied with any We now derive the Laplace approximation to the posterior model that produces a posterior probability distribution over of this model. First, we place priors on the parameters. its parameters p(β|X) and a distribution over its predictions To ensure in-order intercepts, we introduce parameters dj, p(y∗|x∗, X) for a given test sample x∗. What remains for the j ∈ {1, .., k −1} and a stick-breaking re-parametrization of model to be fully specified is defining the distributions that the k−1 parameters aj with aj = Pj = d i=1 i. We place flat generate the training and test data sets. The integral in priors on the parameters p(di) ∝ 1 and all di are restricted Eq. (1) will generally be intractable even for the simplest of to be positive, except for d1. As in the binary case, we place Bayesian models and is typically approximated using Monte normal priors on the coefficients β1, ..., βm ∼ N (0, σβ). The Carlo approximation E[y∗|w∗, w, y] ≈ 1 PN y∗ , where model’s likelihood is N i=1 (i) y∗ is a random sample from the posterior predictive distri- (i) n k m (βi)2 bution in Eq. (1) and can be obtained by sequentially sam- Y Y Y p(β, d|D) ∝ (R e 2σ2β , pling X i,j − Ri,j−1 )(yi=j) (i) from p(X |w), β(i) from p(β|X = X(i) , y), x∗ (i) from i=1 j=1 i=1 p(x∗|w∗), and finally, y∗(i) from p(y∗|x∗ = x∗(i), β = β(i)). where Ri,j = σ(βxi + αj), and the log-likelihood The densities p(X|w), p(x∗|w∗) represent our structural mea- n k surement error model and are in most practical cases easy 1 X X L = C − (β ◦ β)+ I(y to sample from efficiently. The densities p(β|X = X i = j) log(Ri,j −Ri,j−1), (i), y) 2σ2β i=1 j=1 and p(y∗|x∗ = x∗(i), β = β(i)) are the posterior and posterior predictive for the selected prediction model, conditional on where I is the indicator function and C is a constant. The the inputs being fixed, and we have to be able to efficiently gradient of the log-likelihood cannot be written as succinctly sample from them. This implies that we need an efficient as is the case with logistic regression. Instead, we start with prediction model or, in the case of Bayesian models, which the derivative for some parameter θ: are typically computationally intensive, an efficient struc- n k tural approximation to p(β|X = X ∂ 1 ∂ ∂ (i), y). X X L = − (β ◦ β) + L ∂θ 2σ2 ∂θ ∂θ i,j , β i=1 j=1 2.1 Approximate logistic regression Bayesian logistic regression is a discriminative model that where ∂ L ∂θ i,j = 0 if yi 6= j and for a given example (x, y), y ∈ {0, 1}n, yields a probabilistic prediction: P (y = 1|x) = σ(βT x), where σ(x) = 1/(1 + e−x) ∂ 1 ∂ ∂ Li,j = ( Ri,j − Ri,j−1) is the logistic function and β is the parameter vector of the ∂θ Ri,j − Ri,j ∂θ ∂θ −1 model. The typically-used prior is β ∼ N (µ 0, Σ0 ). 1 ∂ = R (βx R i,j (1 − Ri,j ) i + αj ) i,j − Ri,j ∂θ −1 A closed-form expression does not exist for neither its poste- rior nor its predictive distribution. For efficiency, we use the ∂ − Ri,j (βx , −1 (1 − Ri,j−1 ) i + αj−1) well-known Laplace approximation (see [1], pages 213-215). ∂θ The approximation to the posterior is the following Gaus- otherwise. The derivation of the Hessian is omitted. It had sian p(β|X, y) ∼ N (qmap, (Σ−1 + XT RX)−1), where R is a 0 little effect on the accuracy and running times - all results diagonal matrix with entries rii = σ(qT mapxi)(1 − σ(qT mapxi)) reported in the empirical evaluation were obtained using a and qmap is the posterior maximum, which can be obtained numerical approximation to the Hessian. with gradient methods. 3. EMPIRICAL EVALUATION 2.2 Approximate proportional-odds model We empirically evaluated our approach on (a) forecasting We now derive a Laplace approximation to the proportional- basketball game outcomes, which always have a winner and odds model (ordinal logistic regression), which is the most can be treated as a binary regression task, and (b) football commonly used model for the ordinal setting [6]. Let n and match outcomes, where draws are also possible, making it m again be the be the number of samples, and input vari- an ordinal regression task. ables, respectively and k the number of (ordered) categories. The model is based on the assumption that the odds of all In both cases, the count data are first preprocessed to obtain binary decisions between categories are proportional to each uncertainty in the input variables, which is described in Sec- other or, equivalently, that the k−1 logit surfaces are parallel: tion 3.1. As a baseline for comparison, binary logistic regres- P (Y ≤ j|x) sion and ordered logistic regression are included, using mean logit(P (Y ≤ j|x)) = log = βx + α P (Y > j|x) j , counts. We also include the baseline with noisy test cases. 18 0.000 0.00 -0.002 -0.01 -0.004 -0.02 mean relative squared error (+/- SE) mean relative rank probability score (+/- SE) logistic reg. logistic reg. + noise proposed (HMC) proposed (Laplace) -0.03 model ENG FRA GER ITA SPA Figure 1: Estimated prediction errors across all competition train-test season pairs in the NBA basketball data ordinal reg. ordinal reg. + noise proposed (Laplace) set. All errors are relative to the baseline for com- parison (logistic regression). Figure 2: Estimated prediction errors for each foot- This is obtained by treating the test input variables as ran- ball league and across all train-test season pairs in dom, using the same structural model for the inputs as for that league. All errors are relative to the baseline the proposed model, and approximating the expected pre- for comparison (ordial regression) diction of the baseline models with Monte Carlo sampling. For the binary case we also include the proposed model with- out marginalization - jointly modelling β and X, including A ∼ Gamma(α, θ) and B ∼ Gamma(β, θ), then X = A A+B the uncertainty in the form of an informative prior on X. is distributed X ∼ Beta(α, β).. Therefore, for a ratio vari- We implement this model in the probabilistic programming able derived from Poisson posterior rates of the form R = P language and tool for Bayesian inference Stan [8]. i λA,i P P is Beta distributed R|¯ λ λ A,i, ¯ λB,i ∼ Beta(P ¯ λA,i, P ¯ λB,i). A,i + λB,i The scale parameters for the θ are always the same, since the Our empirical evaluation procedure is a straightforward mea- second parameter in the Gamma distribution corresponds to surement of out-of-sample forecasting accuracy, while re- the number of games played, and it is the same for all counts specting the time line. We use train-test season pairs, train- at a particular point in time. ing on one season and forecasting on the next. Only data available prior to a match are used in forecasting it. We mea- sure forecasting accuracy with mean squared error (MSE) in 3.2 Basketball data the binary case and the rank probability score (RPS) in the The basketball data used in our experiments consist of all ordinal case [4]. the regular season and play-off games in the past 13 seasons (12 train-test season pairs) of the National Basketball Asso- 3.1 Preprocessing count data ciation (NBA) from 2001/02 to 2013/14. The data were ob- tained from http://www.basketball-reference.com/. The Input variables that are used as predictors in sports are typ- count variables included in the data are counts of two-point ically count variables or ratios of count variables, in particu- shots made and missed, three-point shots made and missed, lar ratios of the form A where A and B are sums of count A+B turnovers, offensive rebounds and defensive rebounds. We variables. We will assume that the count variables follow in- use these counts indirectly by transforming them into 8 ra- dependent Poisson distributions and are time-homogeneous. tios, described in [10], which are known to be good predictors While a more sophisticated model for measurement error of basketball match outcomes. Note that these ratios are could be used, we focus on illustrating the benefits of taking based on the well-known four factors effective field goal per- into account the uncertainty in the inputs. centage, rebounding percentage, turnover percentage, and free throw percentage [5]. A natural choice for the prior distribution of the rate param- eter λ is the Gamma distribution λ ∼ Gamma(a0, b0), which is conjugate. Therefore, for each count variable with mean 3.3 Football data ¯ λi over ni games, the posterior is again Gamma λi| ¯ λi, ni ∼ Our football data set consists of 5 complete seasons (2010/11 Gamma(a0 + λini, b + ni), where we select weakly informa- - 2014/15) of each of the top 5 European national club com- tive priors a0 = b0 = 0.001. A sum of Gamma distributed petitions: English Premier League, French Ligue 1, Ger- random variables with the same scale is again Gamma dis- man Bundesliga, Italian Serie A, and Spanish La Primera. tributed with same scale. Furthermore, if A and B are That is, 4 train-test season pairs for each of the 5 leagues Gamma distributed random variables with the same scale for a total of 20. In addition to the outcome, we include, 19 ENG FRA GER ITA SPA 0.28 0.24 0.20 0.16 rank probability score (smoothed) 0.12 0 100 200 300 0 100 200 300 0 100 200 300 0 100 200 300 0 100 200 300 season progression (number of matches played) ordinal reg. ordinal reg. + noise proposed (Laplace) Figure 3: Prediction errors over the course of a season, averaged across all seasons and for each football competition separately. for each match and each of the two teams that partici- certainty is highest. This makes the proposed approach a pated in the match, the number of goals scored, shots and valuable contribution to the sports forecasting toolbox, but shots on target, corners, fouls committed, and yellow and it can also be applied in other similar domains. In this pa- red cards received. The data were obtained from http: per we focused on prediction with structured uncertainty in //football-data.co.uk/data.php. the inputs and put less emphasis on the estimation of the uncertainty. Future work could entail developing a model 3.4 Basketball results that allows the Poisson rates to vary over time and takes The structural approximation variant of the proposed model into account the possible correlations between them. outperforms all other models (see Figure 1). These differ- ences are not only statistically discernible, but also practi- 5. REFERENCES cally relevant (see [10] and references therein). Although [1] C. M. Bishop. Pattern recognition and machine the HMC-based approximation yields relatively good pre- learning. springer, 2006. dictions, it is discernibly worse than the structural approx- [2] R. J. Carroll, D. Ruppert, L. A. Stefanski, and C. M. imation. This can, at least in part, be explained by the Crainiceanu. Measurement error in nonlinear models: inferior accuracy of the HMC-based approximation due to a modern perspective. CRC press, 2006. slow mixing (effective sample sizes for β were, on average, [3] W. A. Fuller. Measurement error models, volume 305. ∼ 20% of the total number of iterations and at 1000 itera- John Wiley & Sons, 2009. tions, the estimated MCMC sampling error was, on average, [4] T. Gneiting and A. E. Raftery. Strictly proper scoring approximately 10% of posterior standard deviation). rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 3.5 Football results 2007. Football results are similar to basketball results, however, [5] J. Kubatko, D. Oliver, K. Pelton, and D. T. the proposed model outperforms the other models even more Rosenbaum. A starting point for analyzing basketball convincingly across all football competitions (see Figure 2). statistics. Journal of Quantitative Analysis in Sports, Compared to NBA basketball, football seasons are much 3(3), 2007. shorter (in terms of matches per team) and there is more [6] P. McCullagh. Regression models for ordinal data. uncertainty in the inputs derived from match statistics. As Journal of the royal statistical society. Series B anticipated, the proposed model excels at the beginning of (Methodological), pages 109–142, 1980. each season and the differences between models’ prediction [7] B. Peterson and F. E. Harrell Jr. Partial proportional errors decrease as the season progresses and input variables odds models for ordinal response variables. Applied become more certain (see Figure 3). Similar results were statistics, pages 205–217, 1990. observed for basketball, but are omitted for brevity. Note [8] Stan developement team. Stan Modeling Language that the HMC variant of the proposed ordinal model was Users Guide and Reference Manual, Version 2.8.0, not included in the comparison on football data, because 2015. the computation times make it infeasible for practical use. [9] H. O. Stekler, D. Sendor, and R. Verlander. Issues in sports forecasting. International Journal of 4. CONCLUSION Forecasting, 26(3):606–621, 2010. Our hypothesis that ignoring the uncertainty in the input [10] E. Štrumbelj and P. Vračar. Simulating a basketball variables in sports will lead to less accurate predictions was match with a homogeneous Markov model and correct. The proposed model substantially outperformed forecasting the outcome. International Journal of typically-used models. As expected, the improvement is Forecasting, 28(2):532–542, 2012. most substantial at the beginning of the season where un- 20 Multiobjective discovery of driving strategies using the SCANeR Studio Erik Dovgan Artificial Intelligence Laboratory Department of Intelligent Systems Faculty of Computer and Information Science Jožef Stefan Institute University of Ljubljana Jamova cesta 39, SI-1000 Ljubljana, Slovenia Večna pot 113, SI-1000 Ljubljana, Slovenia erik.dovgan@fri.uni-lj.si ABSTRACT the pollution reduction is proportional to the reduction of This paper presents the SCANeR Studio simulation envi- the fuel consumption. An interesting side effect of this is a ronment, which is used to design future vehicles by several reduction in traveling cost, although this is countered by an worldwide companies. In addition, this environment can be increase in traveling time. It is clear that heavily reducing used to evaluate autonomous vehicle driving algorithms, al- fuel consumption is not optimal because it leads to unaccept- though it has several limitations. This paper describes the ably long traveling time. Therefore, several objectives need main environment limitations that are relevant for the eval- to be taken into account simultaneously when constructing uation of the Multiobjective optimization algorithm for dis- a driving strategy. covering driving strategies (MODS). Finally, a set of actions for mitigating the SCANeR shortages is given. To obtain autonomous vehicle driving, real-time driving stra- tegies have to be applied by the autonomous vehicle driving Categories and Subject Descriptors algorithms. These strategies have to select the best control action at each step with respect to the current vehicle and J.7 [Computer Applications]: Computers in other sys- route state, e.g., position on the route, position of neighbor tems vehicles, velocity of neighbor vehicles etc. The best control action can be selected by implementing an algorithm that Keywords optimizes various objectives, such as the traveling time, the driving strategies, multiobjective optimization, SCANeR Stu- fuel consumption, the distance to other vehicles for colli- dio sion avoidance etc. Since several objectives have to be si- multaneously taken into account, it is preferable to use the multiobjective approach when developing the optimization 1. INTRODUCTION algorithm, since it is in general the case that multiobjec- Autonomous vehicle driving is recently being investigated tive algorithms exhibit advantages over single-objective al- by many automotive and other companies, e.g., Ford [6], gorithms (since they enable, e.g., better exploration of the Mercedes-Benz [7], Toyota [12], BMW [5], Audi [8], and multiobjective search space). Google [13]. Google’s project of autonomous vehicles is led by Sebastian Thrun, former director of the Stanford Ar- To use the autonomous vehicle driving algorithms in prac- tificial Intelligence Laboratory and co-inventor of Google tice, the proper evaluation of these algorithms is of key im- Street View, who wan the DARPA autonomous car com- portance. Since untested algorithms cannot be simply de- petition. Several driver assistance systems are already in- ployed in autonomous vehicles and tested in real environ- stalled in modern vehicles, such as line assist (see, e.g., Volk- ment, e.g., city roads, the evaluation has to be performed in swagen [15], Audi [1], and Toyota [14]). In addition, fully the near-real-life driving simulation environment. Typically, autonomous vehicles start driving in urban environments. the best existing driving simulation environments are used For example, 100 self-driving Volvo cars will drive on public by the worldwide automotive companies. This paper exam- roads around the city of Gothenburg by 2017 [16]. Till June ines the SCANeR Studio simulation environment [9], which 2015, the Google autonomous vehicles have driven over 1 is used by several worldwide companies such as Renault, million miles encountering 200,000 stop signs, 600,000 traf- PSA Peugeot Citroën and VOLVO Trucks [10]. fic lights, and 180 million other vehicles. The goal of the ongoing research is to enhance the existing The development of an autonomous driving solution usually Multiobjective optimization algorithm for discovering driv- focuses on the determination of the vehicle surroundings, ing strategies (MODS) [2, 3, 4] to take into account real-life e.g., other vehicles, obstacles, and pedestrians, in order to driving situations and to evaluate it with the SCANeR Stu- increase safety and avoid collisions. However, such a strat- dio. To this end, MODS will have to be significantly adapted egy may miss to achieve other objectives that are also impor- due to limitations of the SCANeR Studio. tant. An objective that has to be taken into account is the reduction of the environmental pollution. This objective is The paper is further organized as follows. Section 2 describes especially important since the awareness of the need to pro- the existing Multiobjective optimization algorithm for dis- tect our environment is increasing. When driving a vehicle, 21 covering driving strategies. The SCANeR Studion and its various entities in the environment. To this end, it consists limitations are presented in Section 3. Section 4 presents the of several modules such as: ongoing and future enhancements of the MODS algorithm. Finally, Section 5 concludes the paper with the summary of • work. Complex dynamic vehicle models simulating the be- havior of every component of a real vehicle 2. MULTIOBJECTIVE OPTIMIZATION AL- • Simple models of autonomous traffic vehicles simulat- GORITHM FOR DISCOVERING DRIV- ing the behavior of other vehicles on the road ING STRATEGIES • Models for simulating the behavior of pedestrians Multiobjective optimization algorithm for discovering driv- ing strategies (MODS) [2, 3, 4] is a two-level multiobjec- tive optimization algorithm that finds a set of nondominated The behavior of these models can be predefined, controlled driving strategies with respect to two conflicting objectives: online with the scripting language or managed through APIs traveling time and fuel consumption. The lower-level algo- using custom algorithms. In addition, complex vehicle mod- rithm is based on a deterministic breadth-first search, driv- els can be controlled by the driver through the acquisition ing prediction and nondominated sorting, and searches for modules that can be connected to keyboards, gaming steer- nondominated driving strategies. The search is performed ing wheels etc. Vehicle driving is presented to the driver by simulating vehicle driving by route steps. It starts with a using visual and sound modules, and the module that con- single driving strategy with no control actions. If for a route trols dynamic platforms (if present). Moreover, the models step the driving strategy does not define the control action, involved in the simulation can be controlled/executed with the driving strategy is cloned for each possible control ac- either SCANeR user interface or programs that control the tion and the obtained driving strategies are stored in the set simulation, i.e., the supervisors. of driving strategies. When the control action is defined, it is used to predict the vehicle driving for a predefined num- The SCANeR Studio enables to create a complete driving ber of prediction steps. Due to the cloning, the number of simulation environment through the usage of five dedicated driving strategies grows exponentially. To maintain a con- modes of the graphic interface: stant number of driving strategies at each route step, fast nondominated sorting is used to select the most promising • Terrain mode: Road network creator allowing the rapid driving strategies. creation of realistic road networks that are useable di- rectly in the simulation The upper-level algorithm is an evolutionary algorithm that optimizes the input parameters for the lower-level algorithm. • Vehicle mode: Tool for fine-tuning and study of dy- This algorithm applies evolutionary mechanisms, i.e., selec- namic vehicle models tion, crossover and mutation, to the sets of input-parameter values through generations and maximizes the hypervolume • Scenario mode: Driving simulator scenario editing tool covered by the driving strategies found by the lower-level • Simulation mode: Simulation supervision tool algorithm. • Analysis mode: Detailed graphical analysis tool MODS was implemented in two variants. The original MODS algorithm minimizes the traveling time and the fuel con- Although The SCANeR Studio is a powerful tool for evalu- sumption [2, 3], while the enhanced version, called MOCDS, ating autonomous vehicle driving algorithms, it introduces optimizes also the driving comfort [4]. some limitations for the development of these algorithms. When taking into account the MODS algorithm, the SCANeR Currently, the MODS algorithm is integrated and was eval- Studio limitations presented Section 3.1 are the most critical uated in a vehicle driving simulator that includes data from ones. real-world routes and a black-box vehicle simulator. The re- sults show that MODS finds better driving strategies than existing algorithms for discovering driving strategies. How- 3.1 Limitations of the SCANeR Studio ever, MODS has some limitations. For example, it does not Limited implementation of complex dynamic vehicle mo- take into account neighbor/other vehicles on the route, it dels does not produce human-like driving strategies, and it does The SCANeR Studio contains several types of complex dy- not find (good) driving strategies in real time. To over- namic vehicle models such as cars, trucks and buses. The come these shortages, the ongoing research consists of de- instances of these models can be controlled only with throt- sign and implementation of the enhanced MODS algorithm, tle, braking and clutch pedals, gearbox, steering wheel etc. and its deployment in the near-real-life environment, i.e., On the other hand, if an autonomous vehicle driving algo- the SCANeR Studio simulation environment. rithm controls the vehicle by setting vehicle’s velocity, such algorithm cannot be evaluated with complex vehicle models. 3. SCANeR STUDIO SIMULATION ENVI- RONMENT The SCANeR Studio contains also a large set of various The SCANeR Studio [9] is a modular driving simulation simple vehicle models. These models define some vehicle environment, developed by the OKTAL company [11]. It parameters such as length, weight etc., but do not simu- simulates the vehicle driving behavior and the behavior of late engine behavior in details. For example, they do not 22 simulate fuel consumption or engine limitations. Since no The main issue of such behavior is the fact that the simu- engine limitation is simulated, these simple models enable lation results are not deterministic, i.e., the same scenario to instantaneously change the vehicle velocity without lim- with the same SCANeR modules does not always produce itations, which results in unrealistic driving behavior. The the same driving behavior and consequently the same travel- advantage of these algorithms is that they can be controlled ing time, fuel consumption etc. Therefore, an instance of the by autonomous vehicle driving algorithms that set vehicle’s autonomous vehicle driving algorithm has to be evaluated velocity. several times in order to statistically evaluate its quality. Limited offline simulation The synchronization issues are, however, present only in the online simulation mode. On the other hand, the offline sim- The SCANeR Studio enables to execute the simulation in ulation mode does not have these issues, since it schedules an offline mode that schedules the execution of the SCANeR the execution of the SCANeR modules in advance and then and other modules, involved in the simulation, with respect executes them with respect to the predefined, determinis- to their running frequency, and then executes them with tic schedule. Consequently, the sequence of messages sent respect to the schedule without any sleep time. This en- between modules is also deterministic. ables to execute the simulation as fast as possible, which is specially suitable when the autonomous vehicle driving algo- rithms control the vehicle behavior and no user interaction 3.2 Adaptation of MODS to the SCANeR Stu- is required. dio The limitations of the SCANeR Studio significantly influ- The offline simulation can, however, be used only when user ences the enhancement of the MODS algorithm. Since the executes such a simulation thought the SCANeR user in- MODS algorithm controls the vehicle by setting vehicle’s ve- terface. On the other hand, when the simulation has to locity, it requires to use SCANeR’s simple vehicle models. be executed by a supervisor program, the offline simulation Due to the fact that these models do not simulate engine be- cannot be used. This represents a significant limitation for havior in details, such engine behavior will have to be addi- optimization programs that need to test various parameter tionally implemented. This will require the implementation settings of an autonomous vehicle driving algorithm, where of, for example, wheel friction, aerodynamic, slope friction, each setting has to be evaluated with the execution of the inertial, maximum torque, specific fuel consumption, and same simulation scenario. maximum engine moving functions, in order to limit the ve- hicle inertial force and calculate the fuel consumption. Driving prediction is inapplicable Several autonomous vehicle driving algorithms use the driv- The MODS algorithm will require the usage of offline simu- ing prediction technique to construct the driving strategy lation in order to execute the multiobjective optimization al- and select the best control action at each step. The usage gorithm. Ideally, the multiobjective optimization algorithm of driving prediction requires to periodically (ideally at each should be implemented as a supervisor program. Due to the step) stop the driving simulation, start the driving predic- fact that a supervisor algorithm cannot control the offline tion from the current vehicle and route state possibly mul- simulation, the simulation will have to be controlled with tiple times if various control actions have to be tested, and HTTP POST calls. Although controlling SCANeR with then apply the obtained information during prediction in or- POST calls is not optimal, this is the only mode that can der to continue with the driving simulation. To execute such be currently applied for the offline simulation. a procedure, the main simulator has to simulate the driving, while additional simulators have to be periodically executed Driving prediction is one of the key mechanisms of the MODS to predict the driving. Such a configuration is not supported algorithm. Since this prediction is inapplicable in the SCAN- by the SCANeR Studio, since SCANeR requires to execute eR Studio, the MODS algorithm will have to be significantly only one instance of SCANeR modules simultaneously. Oth- redesigned. To compensate the missing driving prediction, erwise, conflicts can appear in the communication between a simple prediction approach will have to be added to the modules if several instances of the same module are active MODS algorithm, e.g., a set of rules that predict the future simultaneously. collision time. MODS also requires exact evaluation of the driving strate- Synchronization issues gies. Since this can be achieved only by the offline simula- The modules of the SCANeR Studio are implemented and tion mode, only this mode will be used. Consequently, the executed as independent processes that communicate be- driving strategies obtained with the online simulation mode tween them through network. Since they are executed as in- will have to be re-evaluated with the offline simulation mode dependent processes, their actual execution depends on the before the comparison with the MODS driving strategies. hardware and OS specifications, e.g., the number of proces- sor cores that can simultaneously execute various processes. 4. ENHANCEMENTS OF THE MODS AL- Consequently, the sequence of messages sent between mod- ules is not deterministic, which is also due to the network GORITHM latency. For example, messages from vehicle model contain- The MODS algorithm is going to be significantly enhanced ing the current vehicle velocity may sometimes be delayed in order to take into account real-life driving situations and with respect to messages for autonomous vehicle driving al- to be evaluated with the SCANeR Studio. Currently, MODS gorithm containing control actions, which may result in de- is being adapted in order to control the vehicle’s velocity. layed reaction of the autonomous vehicle driving algorithm. In addition, the enhanced algorithm uses only the current 23 and the next velocity limits as hypercube dimensions of the multiobjective optimization algorithm. Applied Soft driving strategies [3]. The other attributes, such as the route Computing, 16:50–62, 2014. inclination, are added to the formulas that change the target [4] E. Dovgan, T. Tušar, M. Javorski, and B. Filipič. velocity. In the future, the hypercube dimensions will be Discovering comfortable driving strategies using redefined by taking into account the results of the analysis simulation-based multiobjective optimization. of the human driving strategies. Informatica, 36:319–326, 2012. [5] S. Elmer. BMW targets 2020 for self-driving cars, To adapt to the SCANeR Studio limitations, the prediction 2013. Available online: of the lower-level MODS is not used anymore. Neverthe- http://www.autoguide.com/auto-news/2013/02/ less, a simple prediction will added to the MODS algorithm bmw-targets-2020-for-self-driving-cars.html. if needed. The lower-level MODS will be also enhanced to [6] K. Fitchard. Ford is ready for the autonomous car. simulate the car following, lane changing, and vehicle over- Are drivers?, 2012. Available online: taking behavior. http://gigaom.com/2012/04/09/ford-is-ready- for-the-autonomous-car-are-drivers/. The upper-level MODS is going to be adapted to take into [7] N. Ingraham. Mercedes-Benz shows off self-driving car account the enhanced definition of the hypercubes. The new technology in its new $100,000 S-Class, 2013. parameters of the driving strategies, which will be stored in Available online: http: hypercubes, will include the vehicle velocity, acceleration, //www.theverge.com/2013/5/18/4341656/mercedes- deceleration, duration of acceleration, duration of decelera- benz-shows-off-self-driving-car-technology. tion, etc. In addition, the upper-level MODS will also be [8] D. Johnson. Audi predicts self-driving cars by 2020, enhanced to search for parameter values for the car follow- 2013. Available online: ing, lane changing, and vehicle overtaking behavior. http://www.leftlanenews.com/audi-predicts- self-driving-cars-by-2020.html. 5. CONCLUSIONS [9] Oktal. Automotive simulators, 2016. Available online: This paper presented the SCANeR Studio simulation en- http://www.oktal.fr/en/automotive/range-of- vironment and its limitations that are relevant for the en- simulators/software. hancement of the Multiobjective optimization algorithm for [10] Oktal. Automotive simulators references, 2016. discovering driving strategies (MODS) and its deployment in Available online: this Studio. Four main limitations were described: limited http://www.oktal.fr/en/automotive/references- implementation of complex dynamic vehicle models, limited automotive-simulator. offline simulation, inapplicability of driving prediction, and [11] Oktal. OKTAL, 2016. Available online: synchronization issues. The analysis of MODS and SCANeR http://www.oktal.fr/. Studio shows that MODS will need several adaptations be- [12] R. Read. Toyota will roll out autonomous cars by the fore the evaluation with the SCANeR Studio, e.g., the imple- mid-2010s, 2013. Available online: http://www. mentation of engine behavior and the algorithm redesign to thecarconnection.com/news/1087636_toyota-will- remove the driving prediction from the algorithm and adapt roll-out-autonomous-cars-by-the-mid-2010s. it accordingly. [13] R. J. Rosen. Google’s self-driving cars: 300,000 miles logged, not a single accident under computer control, 6. ACKNOWLEDGMENTS 2012. Available online: The work presented in this paper was in part funded by the http://www.theatlantic.com/technology/archive/ NERVteh, raziskave in razvoj, d.o.o., which also provided 2012/08/googles-selfdriving-cars-300-000- the SCANeR Studio simulation environment. In addition, miles-logged-not-a-single-accident-under- this work was also in part funded by the Slovenian Research computer-control/260926/. Agency under research project Z2-7581. [14] Toyota. Lane Keeping Assist, 2014. Available online: http://www.toyota-global.com/innovation/ 7. REFERENCES safety_technology/safety_technology/ [1] Audi. Stay in lane, 2014. Available online: technology_file/active/lka.html. http://www.audi.co.uk/new-cars/a3/a3- [15] Volkswagen. Lane assist, 2014. Available online: sportback/driver-assistants/audi-active-lane- http://www.volkswagen.co.uk/technology/ assist.html. proximity-sensing/lane-assist. [2] E. Dovgan, M. Javorski, T. Tušar, M. Gams, and [16] Volvo. Volvo Car Group initiates world unique B. Filipič. Comparing a multiobjective optimization Swedish pilot project with self-driving cars on public algorithm for discovering driving strategies with roads, 2013. Available online: humans. Expert Systems with Applications, https://www.media.volvocars.com/global/en- 40:2687–2695, 2013. gb/media/pressreleases/136182/volvo-car-group- [3] E. Dovgan, M. Javorski, T. Tušar, M. Gams, and initiates-world-unique-swedish-pilot-project- B. Filipič. Discovering driving strategies with a with-self-driving-cars-on-public-roads. 24 Ali nam metode strojnega učenja lahko pomagajo pri načrtovanju novih visokoentropijskih zlitin? Anton Gradišek Andraž Kocjan Miha Mlakar Institut »Jožef Stefan« Institut »Jožef Stefan« Institut »Jožef Stefan« Jamova 39 Jamova 39 Jamova 39 1000 Ljubljana, Slovenija 1000 Ljubljana, Slovenija 1000 Ljubljana, Slovenija anton.gradisek@ijs.si andraz.kocjan@ijs.si miha.mlakar@ijs.si POVZETEK podobne atomske radije – zato pri zamenjavah ne nastopijo Visokoentropijske zlitine (angleško: high entropy alloys, HEA) so problemi zaradi različnih velikosti. zanimiva skupina kovinskih materialov, ki so v zadnjem času Zaradi specifične strukture imajo HEA vrsto obetavnih fizikalnih pritegnile pozornost znanstvenikov zaradi lastnosti, zanimivih za in kemijskih lastnosti, zaradi katerih so pritegnili zanimanje uporabo v industriji. HEA so sestavljene iz petih ali več inženirjev za uporabo v industriji. Gre za odlične mehanske elementov, ki so med seboj naključno pomešani. Kljub mešanju lastnosti (denimo visoka natezna trdnost), odpornost na korozijo, pa nered ne pokvari globalne strukture, ki ostane pravilen kristal. nizko obrabo, obstojnost pri visokih temperaturah itd. Poleg tega Zaradi kompleksne zgradbe HEA težko obravnavamo s klasičnimi je sinteza HEA relativno enostavna (običajno se pripravi talino računskimi metodami znanosti materialov. V tem prispevku ustrezne mešanice elementov in jo ohladi) in ne potrebuje raziščemo, če lahko uporabimo metode strojnega učenja na zbirki naprednih metod, kot je denimo rotacijsko kaljenje, ki se že odkritih HEA, da bi lahko napovedovali strukturo sistemov z uporablja za pripravo sistemov, ki so sicer obstojni v novimi kemijskimi sestavami. Ugotovili smo, da binarni temperaturnem območju, daleč od sobne temperature.[1] klasifikator dobro določi posamezne strukture, s točnostjo tudi nad 90 %, medtem ko bi za natančnejše delovanje klasifikatorja za Specifična struktura in sestava HEA predstavlja določene več razredov potrebovali večjo količino podatkov. probleme pri usmerjenem načrtovanju strukture z metodami, ki so sicer dobro poznane za enostavnejše kovinske sisteme. Pri enostavnih sistemih namreč lahko definiramo osnovno celico Ključne besede kristala, s translacijami te celice pa nato lahko rekonstruiramo Visokoentropijske zlitine, strojno učenje, kristalna struktura celoten kristal. Za matematičen opis zato uporabimo periodične funkcije, kar omogoči računsko določanje različnih lastnosti. V 1. UVOD HEA tega pristopa ne moremo uporabiti. Čeprav je struktura Visokoentropijske zlitine (HEA) so relativno nova skupina enostavna (najpogosteje ploskovno ali prostorsko centrirana kovinskih materialov. Prvič so jih v znanstveni literaturi kocka), zaradi naključne razporeditve elementov ne moremo sistematično opisali leta 2004, bolj intenzivne raziskave na tem govoriti o pravi periodičnosti. Zato so zaželene nove metode, s področju pa so se začele v zadnjem desetletju. HEA so zlitine katerimi bi lahko dobili več informacij o HEA, ki bi nas pripeljale najmanj 5 kovinskih elementov, pri tem, da je vsak od glavnih do bolj načrtovanega načina sinteze novih sistemov.[2] elementov zastopan vsaj med 5 in 35 atomskimi odstotki (lahko so V pričujočem prispevku pogledamo, če namesto eksaktnih prisotni tudi drugi elementi v manjših količinah).[1] Ti elementi fizikalnih izračunov za napovedovanje strukture HEA lahko imajo nizko entalpijo mešanja, zaradi česar v Gibbsovi enačbi uporabimo bazo že znanih sistemov (v člankih [3,4] jih je proste energije (ΔG = ΔH - TΔS, kjer je G Gibbsova energija, H opisanih okrog 300) v povezavi z algoritmi strojnega učenja. Ti entalpija, T temperatura in S entropija) pride do prevlade algoritmi nam lahko odkrijejo povezave med posameznimi entropijskega člena, kjer je entropija definirana z Bolzmannovim parametri, ki jih sicer morda ne bi opazili, saj se pokažejo šele v zakonom o konfiguracijski entropiji. Slednja je najvišja ob visoko dimenzionalnem prostoru. Podobni pristopi se že ekvimolarnem deležu vseh elementov in je tedaj odvisna le še od uporabljajo na področjih, kot je medicina, kjer denimo števila dodanih elementov N, ΔS = R lnN, kjer je R splošna uporabljajo metode strojnega učenja za iskanje povezave med plinska konstanta. Poenostavljeno bi lahko dejali, da entropijski posameznimi geni in vrstami raka, ki bi ga ti geni lahko člen v enačbi (ki predstavlja neurejenost sistema) v primeru HEA povzročali.[5,6] Tu predstavimo prvo enostavno študijo, kjer nas prevlada nad entalpijskim členom, ki bi sicer pripeljal do zanima, kako dobro lahko s pomočjo algoritma napovemo segregacije posameznih elementov oziroma do tvorbe strukturo HEA z znano sestavo. intermetalnih faz z le dvema do tremi elementi in običajno kompleksnejšo strukturo, kot jo imajo HEA zlitine sicer. Če namreč pogledamo fazne diagrame za zlitine iz dveh ali treh 2. PODATKI IN METODA elementov, vidimo, da v njih nastopa množica različnih faz in Za bazo podatkov smo uporabili seznam HEA iz članka avtorjev struktur, kar pomeni, da imajo take zlitine običajno kompleksno Toda-Caraballo in Rivera-Diaz-del-Castillo [6], kjer je opisanih mikrostrukturo. Posledično so običajno krhke in brez posebne 323 znanih HEA. Za vsakega od sistemov so navedeni naslednji praktične uporabe. Po drugi strani pa ravno entropijski člen v podatki: HEA povzroči, da se atomi različnih elementov uredijo v relativno majhno število enostavnih struktur. Pri tem je potrebno poudariti, • Družina oz. seznam elementov, ki v HEA nastopajo, denimo AgAuCuPdSi. da v HEA običajno nastopajo elementi, katerih atomi imajo 25 • Kemijska sestava, ki pove tudi razmerje med elementi. 3. REZULTATI Pri HEA CoCrFeNiCu so elementi zastopani v enakih Klasifikacijskega problema se lahko lotimo na več načinov. atomskih razmerjih, pri Ti0,5Co1,5CrFeNi1,5Mo0,5 pa ne. Idealno bi bilo izdelati klasifikator, ki bi bil sposoben napovedati katerokoli od struktur, ki nastopajo v naši bazi podatkov. Ker pa • Struktura. Najpogostejše strukture so BMG (bulk metallic glass – kovinsko steklo), FCC (face-centred imamo v več kot polovici od 42 razredov po le en ali dva primera, cubic – ploskovno centrirana kocka), BCC (body- se taka klasifikacijska metoda ne bo dobro obnesla. Če imamo centred cubic – telesno centrirana kocka) ter premalo učnih podatkov, se algoritem namreč dobro nauči kombinacije FCC in BCC. prepoznavati le obstoječe primere, ko dodamo nove primere, pa odpove. • δ% - razlika v atomskih radijih Klasifikacijski model zato najprej poskušamo zgraditi na petih • ΔH razredih, torej na štirih najpogostejših strukturah in razredu mix – mešalna entalpija (v kJ/mol) Ostalo. Kot klasifikacijski algoritem se v tem primeru najbolje • Δχ% - razlika v elektronegativnosti obnese Random Forest (naključni gozd), ki ima 62 % klasifikacijsko točnost. • Ω, parameter, definiran kot Ω=Tm ΔSmix/| ΔHmix |, kjer je ΔSmix entropija mešanja in Tm temperatura tališča Bistveno boljše rezultate dobimo, če zgradimo binarne klasifikatorje. V en razred postavimo primere z določeno • μ, parameter, definiran kot μ= Tm/TSC, kjer je TSC strukturo, drug razred pa predstavljajo vsi ostali primeri. Tudi temperatura spinodalne dekompozicije tokrat se najbolje obnese metoda Random Forest. Klasifikacijske točnosti za binarne klasifikatorje s to metodo so predstavljene v • VEC – koncentracija valenčnih elektronov (ang.= valence electron concentration) Tabeli 2. • e/a – število valenčnih elektronov na število atomov v Tabela 22. Klasifikacijske točnosti binarnega klasifikatorja osnovni celici Izbrana struktura vs. Ostale strukture z metodo Random • s Forest m% – neujemanje medatomskih razdalj • K Struktura Točnost (%) m% - neujemanje parametrov mreže Večino zgornjih parametrov lahko izračunamo iz atomske sestave BMG 91 in parametrov za posamezne atome.[6] Nas je zanimalo, če lahko FCC+BCC 87 na podlagi opisanih parametrov določimo fazno sestavo ter njihovo kristalno strukturo. Število HEA s posamezno strukturo je BCC 77 predstavljeno v Tabeli 1. FCC 92 Tabela 1. Število HEA s posamezno strukturo v bazi podatkov Vidimo, da so klasifikacijske točnosti binarnega klasifikatorja Struktura N veliko višje od tistega, ki deluje na petih razredih. V primeru večkratnega klasifikatorja imamo namreč v vsaki od množic BMG 76 majhno število elementov, pri binarnem klasifikatorju pa nekaj od FCC+BCC 74 teh množic združimo in tako povečamo število elementov. BCC 31 Metoda Random Forest deluje tako, da na naključnih podmnožicah učne množice zgradi odločitvena drevesa, nato pa s FCC 31 pomočjo teh dreves klasificira nov primer – končni razred je tisti, ki ga napove največ dreves. Čeprav je ta metoda najbolj uspešna Ostalo* 111 pri klasifikaciji, ni intuitivno razumljiva za uporabnika. Podatke * V razredu Ostalo se nahajajo vse preostale strukture, ki so lahko namesto tega predstavimo s posameznim odločitvenim zastopane z manj kot 30 primeri. Gre za intermetalike (IM), drevesom (klasifikacijski algoritem J48), kjer v vsakem listu heksagonalno celico (HCP), Lavesovo fazo (L), kombinacije drevesa naredimo odločitev glede na določen parameter. Primer zgoraj navedenih in drugo. Skupaj imamo 42 možnih struktur, takega odločitvenega drevesa za binarni klasifikator BMG vs. vendar pa je večina le-teh zastopana z le po enim ali dvema Ostale strukture je prikazan na Sliki 1. Pod spremenljivko Value primeroma. prva vrednost označuje število ostalih struktur, druga pa število BMG. Klasifikacijska točnost tega drevesa je 89 %. Vidimo, da je najpomembnejši parameter δ%, sledita mu ΔH Napovedovanje strukture HEA lahko obravnavamo kot mix in μ. klasifikacijski problem, ki se ga lotimo z algoritmi strojnega učenja. Pri tem strukturo obravnavamo kot razred, devet 4. DISKUSIJA parametrov (δ%, ΔH Kot pokaže naša enostavna študija, je uporaba metod strojnega mix, Δχ%, Ω, μ, VEC, e/a, sm% in Km%) pa kot atribute. Preizkusili smo več algoritmov, ki so vgrajeni v učenja lahko koristna pri napovedovanju strukture program Weka.[7] Vsakič smo klasifikacijsko točnost preverili z visokoentropijskih zlitin. V znanosti materialov je še posebej metodo navzkrižne validacije, pri kateri začetno množico zanimivo iskanje prehodov med posameznimi strukturami. Naš podatkov razdelimo na N podmnožic, nato zgradimo skupen pristop nam omogoča izdelavo enostavnega simulatorja strukture. klasifikacijski model na N-1 podmnožicah in ga testiramo na Če nas denimo zanima prehod med BMG in BCC, ko preostali. To ponovimo N-krat. spreminjamo razmerje dveh od petih elementov v zlitini, nam tak 26 simulator lahko pove, okrog kakšne sestave lahko pričakujemo prehod med strukturama. Na ta način lahko k sintezi novih sistemov pristopimo bolj sistematično in ne le na podlagi poskušanja. Naše modele lahko izboljšamo z večjo količino podatkov v učni množici, preučiti pa velja tudi, ali lahko uvedemo še dodatne atribute, ki bi v model vnesli nove informacije. Slika 1: Odločitveno drevo za binarni klasifikator BMG vs. Ostale strukture. 5. REFERENCE [1] Tsai, M. H., Yeh, J. W., High-Entropy Alloys: A Critical Review, Materials Research Letters, 2014, 2, 107-123. [2] Murty, B.S., Yeh, J.W., Ranganathan, S. High-Entropy Alloys, Butterworth-Heinemann, Elsevier, 2014, p. 57. [3] Poletti, M. G., Battezzati, L. Electronic and thermodynamic criteria for the occurrence of high entropy alloys in metallic systems, Acta Materialia, 2014, 75, 297-306. [4] Toda-Caraballo, I, Rivera-Diaz-del-Castillo, P. E. J., A criterion for the formation of high entropy alloys based on lattice distortion, Internetallics, 2016, 71, 76-87. [5] Wang Y. et al., Gene selection from microarray data for cancer classification—a machine learning approach, Computational Biology and Chemistry, 2005, 29, 37-46 [6] Guyon I. et al., Gene Selection for Cancer Classification using Support Vector Machines, Machine Learning, 2002, 46, 389-422. [7] Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten I. H. The WEKA Data Mining Software: An Update; SIGKDD Explorations, 2009, 11, 1. 27 Markov chain model for energy-efficient context recognition Vito Janko Mitja Luštrek Jožef Stefan Institute Jožef Stefan Institute Jožef Stefan International Postgraduate School Jožef Stefan International Postgraduate School Jamova 39, 1000 Ljubljana, Slovenia Jamova 39, 1000 Ljubljana, Slovenia vito.janko@ijs.si mitja.lustrek@ijs.si ABSTRACT sound level, or some higher level activities such as ”shop- Continuous sensing of the user and subsequent context recog- ping”, ”traveling” or ”working”. nition using wearable sensors is a popular area of research. One of the big problems of such automatic context recogni- A major limitation of such continuous sensing and context tion is the battery life of the sensing device. To maximize recognition is its heavy toll on the sensing device’s battery the energy efficiency, the context recognition system should life. This is especially relevant for smartphones, which have adapt its settings to the current situation. Choosing the ap- a very limited battery that must be shared between many propriate setting for each situation usually requires either a applications, but the same limitation applies to basically lot of expert knowledge or extensive experimentation. We any wearable device. This issue is often neglected when dis- propose a method that simulates all possible combinations cussing the design of context-recognition systems, however of contexts and settings using a Markov chain model, au- it is an essential component if such systems are to be used tomating and speeding up the whole process. We show on in practice. a small example that the simulation is accurate, and that it allows us to quickly select best trade-off between the energy Some solutions deal with this issue by optimizing the sam- efficiency and context-recognition accuracy. pling rate or sensor duty cycle for the particular recognition task [1]. This alone however, might be suboptimal. We Categories and Subject Descriptors might for example want to have the GPS active at a high sampling rate while the user is driving, but at a very low G.3 [Probability and statistics]: Markov processes; H.2.8 rate when he is working in the office. This calls for dynamic [Database Applications]: Data mining changes in the settings for both the sensors and for the sub- sequent processing. Many adaptive approaches already exist Keywords [3][4][5][6]. Context recognition; continuous sensing; energy efficiency; Markov chain; acitivity recognition. A problem with these and similar solutions is that they have complex pipelines and/or require many parameters specific 1. INTRODUCTION to the particular recognition problem. If we were to recog- Widespread accessibility of wearable sensing devices allows nize different activities using different sensors we would have for many possibilities for tracking the users who wear them. to adapt these parameters, requiring either a lot of experi- Possible applications range from measuring their exercise mentation or expert knowledge. It would therefore be useful patterns and checking on their health, to giving them location- to have a method that can be provided with a sensor-rich specific recommendations. The recognition of user’s context dataset, and it would be able to tell which sensors or which using sensors is a popular and mature area of research [2]. sensor setting to use in each context. For example, using activity recognition, we can recognize ”walking, ”running”, ”resting” and similar activities from A relatively simple solution of the above problem can be accelerometer data. This task was made easier and more found in the work of Yan, et al. [6]. They select the sensor practical with the increased use of smartphones, which have and attribute settings based on the last classified activity. many sensors built in and are often carried. Sensing with For each activity they find the setting that best recognizes multiple sensors, possibly at once, opens additional options it, by testing all the settings on their dataset, and then use for context recognition: detecting one’s location, ambient it when that activity is detected. However, since the selec- tion is made for each activity in isolation, the effect on the whole system can be unpredictable. To illustrate: an ac- celerometer is very good at recognizing walking and resting, while a GPS is very good at recognizing driving. However, if we have only the accelerometer active while walking, driving will never be detected and the sensor switch will never oc- cur. To take such interactions into account, we would have to run an experiment that has a specific setting for each activity and switches between them in runtime. Since there 28 are many (activity, setting) combinations, this process might S(c1, s1) - the Markov state with context c and setting s. be prohibitively time consuming, possibly taking months on T (a, b) - the probability in the dataset that the next context large datasets with many possible settings. will be b given that the current one is a. C(s) - the set of all contexts that use setting s. We propose a Markov chain model to simulate runtime set- Cs(c1|c2) - the probability that the classifier that works with tings switching and predict what would happen if we used setting s will classify an instance to c1, if the true context is a particular setting for a particular context. Using this c2. model requires only a small amount of actual experimenta- tion. Since the experimentation is a lot more time consum- ing than the simulation, many more combinations can be Having all transition probabilities, we can use the basic tried out and a better combination is therefore expected to Markov chain calculus to calculate the steady state of the be found with the same resources. In this paper we explain Markov chain. This gives us the amount of time the system the proposed Markov chain and try it on a simple activity will be in each of the states. Since we know how much time recognition dataset. any setting is active and how much energy this setting con- sumes per time unit, the energy consumption of the whole 2. METHOD DESCRIPTION system can be estimated. Additionally, since confusion ma- Suppose we have a sensing system that works as follows: trices give us the accuracy for each state, we can calculate the user is in one of predefined contexts (for example, the the accuracy of the whole system. context can be the current activity) and each context is as- sociated with some setting. Settings can be which sensors are in use, or what sampling frequency or duty-cycle policy X Energy estimation = t(m)e(s(m)) is active, or which feature set is used for classification, and m∈M so on. Using the current setting, we watch for a possible context change, and if it happens, we switch the sensor set- ting to the one assigned to the new context. For example: if M - the set of all states in the Markov chain. we are sitting we need a lower sensor frequency, compared to t(m) - the predicted proportion of time spent in state m. when we are walking. To determine the optimal parameters e(s) - the energy requirement of a setting s in a given time for such a system, we might want to try every combination unit. of assignments of reasonable settings to possible contexts. If s(m) - the setting corresponding to state m. we have c contexts and s settings, we would have sc different combinations. Our goal is to make only s experiments and simulate the rest, gaining a drastic increase in efficiency. X Accuracy estimation = t(m)acc(c(m), s(m)) We begin by selecting all reasonable settings. For each of m∈M them, we make an experiment where the classification model is trained and tested with this particular setting. For each experiment, we calculate and remember the confusion ma- acc(c, s) - the accuracy of the classifier that works with set- trix. Additionally we need the transition probability from ting s, if the true context is c. one state to the next one; that can easily be inferred from the c(m) - the context corresponding to state m. dataset. Finally we must make energy consumption estima- tion for each setting. Energy consumptions for most sensors at different configurations are known or can be estimated It should be noted that many other metrics can be deter- with simple measurements. mined from such a model. Example: the accuracy for a par- ticular activity or the latency of activity change detection. To simulate an experiment where the settings are dynami- They can be used instead of the accuracy when evaluating cally switched depending on the current activity, we create the performance. a Markov chain. This model has a state for each possible (context, setting) combination. The Markov state (c,s) rep- Every simulated experiment represents a possible trade-off resents that we are currently in context c, with the system between the accuracy and energy consumption. Simulating having setting s (which depends on which context the sys- all the combinations, the Pareto optimal trade-offs can than tem believes we are currently in). be presented to the application designer, which - knowing the energy and accuracy requirements of the application - Next we have to calculate the transition probability from can then choose the ideal settings for it. one Markov state to another. They can be calculated from the transition probabilities of contexts and data from the 3. EXPERIMENTAL SETUP previous computed confusion matrices. Intuitively: we get We will demonstrate our method on a simple example. We in a state (c,s) if the context really changes to c and if the have a dataset of accelerometer data generated by a smart- system classifies this instance into one of the contexts that phone, and we want to classify basic activities (Table 1) from have assigned setting s. it. We can do this by using two different settings: a high frequency sampling (50 Hz) and a low frequency sampling (2 Hz). Two extremes were chosen for simplicity. In practice X S(c we would perhaps also try other frequencies (10 Hz, 20 Hz 1, s1) → S(c2, s2) = T (c1, c2) Cs (c|c 1 1) c∈C(s etc...) and duty cycles. We assign one of the two frequencies 2 ) 29 to each activity, generating 16 different combinations. All 16 combinations will be simulated by doing only 2 actual experiments. In this case we might expect that the best combination would be using the low frequency for the activ- ity rest, and the high frequency for the other activities, but the trade-offs are not so clear in general. The confusion matrices are in Table 2 and Table 3. States of the generated Markov chain are in Figure 1. Note that such a chain is in principle fully connected, including the connections of each state to itself. s w r c 62.7 21.1 10.4 5.8 Table 1: Proportion (in %) of each activity in the dataset. The values are in %. s - rest, w - walking, r - running, c - cycling Figure 1: Markov chain states for our example. Ver- s w r c tical axis signifies the true activity, while the hori- s 97.6 1.4 0.0 1.0 zontal signifies the setting, which depends on the w 5.6 86.2 5.5 2.7 last classified activity. r 0.7 10.0 89.1 0.2 c 22.9 16.4 0.0 60.7 Table 2: The confusion matrix using the high fre- We also plotted the Pareto front that shows the sensible quency. The values are in %. solutions. We see that the Markov chain simulation points very closely correspond to the non-simulated ones. The sim- ple simulation captures the general trend of the Pareto front, s w r c but makes substantial mistakes in predicting the actual val- s 97.2 2.3 0.1 0.4 ues. If the interaction between sensors and activities were w 9.5 82.4 7.2 0.9 more complex, we expect the error to be be even greater. r 3.9 28.4 67.7 0.0 The error is numerically evaluated in Table 4. c 46.8 32.6 1.5 19.1 acc. energy Table 3: The confusion matrix using low frequency. Markov 2.03 0.35 The values are in %. We can observe that the accu- Simple 12.82 1.83 racy for rest does not change much compared to the high frequency case, while the accuracy loss when Table 4: The average prediction mistake, made on cycling is quite drastic. energy consumption and classification error for the simple and Markov chain model. The values are given in % with respect to the maximal value for We also did a simpler simulation in the spirit of Yan, et. the corresponding axis. al [6], where every activity was considered in isolation. In this case, the accuracy of the system was computed simply by computing the accuracy for each activity given its set- We also explored what happens if the underlying activity ting and then weighted by this activity’s proportion in the distribution in the dataset changes. This can be easily simu- dataset. lated by modifying the transition probabilities of the Markov chains. It turns out that while the values themselves change 4. RESULTS drastically, the overall shape of the Pareto front remains sim- ilar. This means that the best simulated trade-off likely re- The method was evaluated in the following way. First all mains the best with most other activity distributions. Such the simulation trade-offs were computed (both Markov chain simulation is also handy to see if the energy requirements ex- simulation and the simple one). Then we ran the actual ceed the application limits if some condition changes. Note experiments we were simulating, switching classifiers and that no additional experiments with the actual data were sampling frequencies during the runtime. All three sets of needed to generate this information, which is an additional results were plotted in Figure 2. benefit of our method. The trade-offs in Figure 2 are marked with letters that cor- respond to the activities where the low frequency was used. The case where only rest was used with the low frequency (marked as ’s’) could be considered the best trade-off be- tween the energy gained compared to the accuracy lost. 30 6. REFERENCES [1] Aftab Khan, Nils Hammerla, Sebastian Mellor, and Thomas Plötz. 2016. Optimising sampling rates for accelerometer-based human activity recognition. Pattern Recognition Letters (2016). [2] Seon-Woo Lee and Kenji Mase. 2002. Activity and location recognition using wearable sensors. IEEE pervasive computing 1, 3 (2002), 24–32. [3] Hong Lu, Jun Yang, Zhigang Liu, Nicholas D Lane, Tanzeem Choudhury, and Andrew T Campbell. 2010. The Jigsaw continuous sensing engine for mobile phone applications. In Proceedings of the 8th ACM conference on embedded networked sensor systems. ACM, 71–84. [4] Jeongyeup Paek, Joongheon Kim, and Ramesh Govindan. 2010. Energy-efficient rate-adaptive GPS-based positioning for smartphones. In Proceedings of the 8th international conference on Mobile systems, applications, and services. ACM, 299–314. [5] Yi Wang, Jialiu Lin, Murali Annavaram, Quinn A Jacobson, Jason Hong, Bhaskar Krishnamachari, and Norman Sadeh. 2009. A framework of energy efficient Figure 2: Black points are real trade-offs, blue mobile sensing for automatic user state recognition. In points are simulated with Markov chains, red points Proceedings of the 7th international conference on are simulated with the simple model. The Pareto Mobile systems, applications, and services. ACM, fronts are drawn using corresponding colors. The 179–192. lower left corner represents the point with the low- [6] Zhixian Yan, Vigneshwaran Subbaraju, Dipanjan est error and the lowest energy consumption. We Chakraborty, Archan Misra, and Karl Aberer. 2012. can see that Markov chain simulation corresponds Energy-efficient continuous activity recognition on very closely to the values of real experiments. mobile phones: An activity-adaptive approach. In Wearable Computers (ISWC), 2012 16th International Symposium on. Ieee, 17–24. 5. CONCLUSION AND FUTURE WORK The simulations display a very high decree of fidelity to the actual experiments and are very fast. Our method can thus effectively tackle the important but difficult task of selecting system settings that give us a good compromise between the accuracy and energy efficiency. Future work on the topic will include testing the proposed method on a more complex dataset that contains more sen- sors and activities to see if the results still have the same fidelity. Another improvement will be to explore options for searching the settings combination space more efficiently. Since this is essentially a multi-objective optimization prob- lem, many approaches from that area can be then used to further increase the effectiveness of our method. 31 Reliability estimation of individual predictions for incremental models Anže Javornik Igor Kononenko Darko Pevec univ. dipl. inž. rač. in inf. Ph. D., Professor PhD. Dragomelj 150 Faculty of Computer and Inf. Sc. Faculty of Computer and Inf. Sc. 1230 Domžale Večna pot 113,1000 Ljubljana Večna pot 113,1000 Ljubljana +386 41 550 515 +386 1 479 8230 +386 1 479 8230 anze.javornik@pointar.si xaigor@fri.uni-lj.si darko.pevec@gmail.com ABSTRACT Since the knowledge is static and based on the data that was Reliability estimate of an individual prediction can represent a very gathered from the sensor when it was still in a good operating form, important information, especially if a wrong decision can have the accuracy of the model and reliability estimate will drop. To serious consequences. Same as classical prediction models, battle such issues an incremental approach is needed. classical methods for reliability estimation are based on known examples – the learning set. Such methods cannot be used with 2. INCREMENTAL LEARNING incremental models, since this knowledge is not available. In this Classical learning requires that enough data is available to construct article we present three known methods of reliability estimation knowledge. Incremental learning has no such requirements. In the and propose their incremental versions. We tested incremental and process of incremental learning each new marked example is used classical methods using four incremental models on eight data to refine the current model. streams with and without drift. Obtained results were statistically analyzed and compared. 2.1 Incremental Naive Bayes Classifier Naive Bayes classifier makes prediction with an assumption that Categories and Subject Descriptors different attribute values with given class are conditionally I.2.m [Artificial intelligence]: Miscellaneous independent. The classification formula is based on Bayes rule: | ) General Terms | ) = ) ) Algorithms, Reliability. The task of learning process is to use the learning set and Keywords approximate probabilities on the right hand-side of equation. The knowledge of naive Bayes classifier is a table of approximations of Incremental, Machine learning, Classification, Reliability estimate prior probabilities of classes P(rk) and a table of conditional probabilities P(rk | vi) of classes rk with value vi for attribute Ai. 1. INTRODUCTION In incremental learning the algorithm needs to refine prior Task of machine learning is to construct a model, which is used for probabilities and conditional probabilities. In [2] the adjustment forecasting. The learning process uses known examples to adapt the formula for prior probabilities is defined as prediction model in such a way that it is able to solve problems 1 better. The result of learning process is knowledge which can be a set of remembered examples, an algorithm or a set of rules which ′ ) = + 1 ) + 1 + ; = describe the problem domain [1]. Such knowledge is called a model, which can be used for understanding the problem better or + 1 ) ; ≠ for predicting new examples. where The process of forecasting is closely linked to the concept of = | | + | | reliability. In situations when wrong prediction can have serious and for conditional probabilities consequences, we would like to know how accurate the prediction 1 is. However, for an unmarked example we cannot provide an exact accuracy but can only estimate it. 1 + , ) + 1 + ; = !" = , ) = Classical prediction models and methods of reliability estimation 1 + , ) ; = !" ≠ are based on examples from a problem domain and so their , ) ; ≠ knowledge is static. Often the problem domain changes over time. where Consider the sensor readings whose sensitivity is decreasing. = | | + #$%"& ') where |Ai| represents the number of values of attribute Ai and count(rj) represents the number of examples whose class is rk. 32 2.2 Incremental Decision Trees that it forgets previously seen examples and is thus sensitive to Decision tree is composed of internal nodes that correspond to example distribution. attributes, branches that correspond to subsets of attribute values The space adjustment strategy which we developed in our work and leaf’s that correspond to classes. A path from root to a leaf differs from the sliding window in a sense that it does not forget corresponds to one decision rule. The task of the learning process previously seen examples. Instead of tracking the space of specific is to build a tree from the learning set in such a way, that for each examples (xi, Ci) we track the space of examples’ description current level it finds a most important attribute – one who splits the (x ’ ’ i , Ci ). Let example set best – and use it as the root of current subtree. 3 = 4 5 , 6 , … , 58, 68 9 Incremental algorithms are refining the decision tree for each new be an example space where each example x is defined with a class example. The goal of the algorithm for incremental learning of label C and a set of attributes decision tree is to construct the same decision tree as non- incremental algorithm on the same set of examples. An example of 5 4: , … , :89 incremental learning algorithm is ID5R, which for each new A description of example space E is defined as (x ’ ’ E , CE ) where example finds most important attribute Ai and lifts it towards the 5< 4:< , … , :< , 6<9 root of the tree and refines the tree [3]. ; 8 :< ⨍ 5 . : , … , 5 2.3 Incremental Bagging and Boosting 8. : < Traditional algorithms for supervised learning give their result 6; ⨍ 5 .6 , … ,58. 6 according to the individual basic prediction model. Aggregated ⨍ :, ? : @ ∗ ? / : learning algorithms combine predictions of many such base models Let where each of them is learnt in traditional way. Bagging and boosting are well known aggregate algorithms for which it was ℙ 4 5<, 6< , … , 5< , 6< 9 shown that they provide better results compared to individual be a set of descriptions where 5< is a description and 6< the class models [4]. of description. For each new example 5, 6 we need to decide Bagging is a method in which M different learning sets are either we join this example into existing description or start generated using randomly selected examples from original learning building a new description with it. We based this decision on ratio set which can be repeated. Each generated learning set contains between the closest and the furthest description with the same class each example K-times, where label. If the ratio is smaller than some theta value we join it with 1 1 01 the closest description of the same label, else we start building a ( = )) = *+),- new description. In case of building a new description, we also +. -1 / +. implemented the logic to reduce space size. If the number of total which is a binominal distribution. With N → ∞ distribution of K remembered descriptions exceeded the allowed amount, then in the approximates to Poisson(1) distribution. whole space we find the closest two descriptions with the same This feature is utilized by incremental bagging. For each new class label and join them into one. Using the mentioned method on example for each individual base model M the algorithm chooses first hundred examples of SEA generator [6] the space contained this example K ~ Poisson(1) times and refines the model M. only fifty-nine descriptions. On the left side of Figure 1 we see how the space would look like if we would remember every example Boosting is a more complex process of base model set generation. and on the right side is the space using the space adjustment Each base model M1, …, Mn is learnt with weighted learning set. strategy. The weight is defined by the classification error of previous base prediction models. Incremental learning for boosting uses Poisson sampling to approximate aggravating. The algorithm works as the algorithm for bagging with a difference that when a base model wrongly classifies an example the Poisson distribution parameter, bound to the example, is incremented for the next base model, else it is decreased [5]. Figure 1. Distribution of examples in internal space In our work we took three reliability scores (CNK, LCV, BAGV) 3. RELIABILITY ESTIMATION from [5] and modified them by adding the incremental nature. Since we do not know the actual result of an example we can only speculate about the prediction accuracy. The speculation is based 3.1 Incremental Local Error Score on previously seen examples. In most cases we focus only on space Classical local error score is defined as follows: let K be the near the current example – that is closest neighbors. models’ prediction of an unmarked example 5, / and + 4 5 , 6 , … , 5 , 6 9 With incremental learning the space of known examples becomes the set of closest neighbors, where Ci is unlimited which poses a problem since we do not have unlimited correct class label of the i-th example. The reliability estimate space nor the time to find the closest neighbors. In our work we (CNK) is then the average distance between the class label of the used two different strategies to deal with space of known examples: closest neighbors and the models prediction sliding window and space adjustment. ∑ ƒ 6 , ( 6+( The sliding window is a simple strategy. In this strategy we ) remember the new examples and forget old examples, always Incremental version (iCNK) for the example space management keeping a specific number in memory. Downside of this strategy is uses previously described strategy, the space adjustment. Instead of 33 using actual examples iCNK operates on closest descriptions 2. for 5 , 6 in ℙ + 4 5<, 6< , … , 5< , 6< 9, and is defined as 3. construct model U "VW ℳ on ℙ ∖ 5 , 6 ∑ ƒ 6<, ( !6+( 4. use model U and get prediction ( for 5 ) 5. calculate distance ƒ 6 , ( where 6< is the class label of the i-th description (Algorithm 1). 6. end for 7. calculate score !R6 ∑ ƒ FG,HG Algorithm 1. iCNK Definitions: N – max. number of descriptions 8. refine ℙ with (x, K) E – theta 9. if |ℙH| > + @ – adjustment rate 10. find descriptors , ' ∈ ℙ , for which ℙ – description set min ƒ , ,' ' and . 6 '. 6 k – num. of relevant closest neighbors 11. in ℙ replace , ' with ' Input: example with prediction (x, K) 12. end if Output: reliability estimate 13. return score 1. find set of k closest neighbors ℙ 2. I calculate score !6+( @, + ∑GJK ƒ FG,H L 3.3 Incremental Variance of Bagging Model 3. refine ℙ with (x, K) The aggregated prediction model, which is composed of m 4. if |ℙ| > + prediction models, provides its prediction as the average of individual predictions: 5. find descriptors , ' ∈ ℙ , for which Y min ƒ , ( ,' ' and . 6 '. 6 ( S 6. in ℙ replace , ' with ' 7. The reliability estimate BAGV is defined as the variance of end if prediction distributions: 8. return score 1 Y Z [ S ( / ( \ 3.2 Local Cross-Validation Score The incremental version for space management uses a version of The main idea behind LCV score is that it is independent from the “sliding window” strategy. We define iBAGV on space models’ prediction. From the set of closest neighbors + ℳ ]U , … , U 4 5 , 6 , … , 5 , 6 9 8, … , U8^_à ,b∗8 c, where Mi is an incremental it builds k models using cross validation model, d the rate of forgetting and n a predefined number of “leave one out” and uses the model to predict the example left out. models. The position (order) of a model M in ℳ is also important. The score is then defined as In a sense we have converted a set of aggregated models in BAGV ƒ 6 , ( algorithm to a list and extended it to include additional " R6 S max 1, d ∗ " ) models. The models under indexes 1 to n are then used for calculating the score, while models under indexes n + 1 to Incremental version of the algorithm (iLCV) uses the adjustment the end are used for incremental learning. When it is time to space strategy for managing examples and is thus operating on implement the forgetting feature, we take first ∗ d models from + 4 5<, 6< , … , 5< , 6< 9 (see Algorithm 2) start of the list, re-initialize them and push them to the end of the list, forcing previous models towards start (see Algorithm 3). Algorithm 2. iLCV Algorithm 3. iBAGV Definitions: N – max. number of descriptions Definitions: ℳ– prediction model list E – theta g – adjustment rate @ – adjustment rate d – forgetting rate ℙ – description set n –number of models k – num. of relevant closest neighbors bagInit – is bag already initialized ℳ– base incremental model count – counter for applying forgetting Input: example with prediction (x, K) Input: example with prediction (x, K) Output: reliability estimate Output: reliability estimate 1. find set of k closest neighbors ℙ 34 1. define relevant model set ℳh 4U 9, where U ∈ ℳand Results show that incremental scores and classical scores had about ! ≤ " the same percentage of statistically significant positive correlation. The estimate iBAGV also shows a good percentage of statistically 2. for U in ℳh significant negative correlation. However, estimates iCNK and 3. calculate prediction ( with U for 5 iBAGV in most tests provided a higher correlation coefficient. The estimate iLCV performed about as well as its classical version. 4. end for 5. calculate score !Z [ ∑Y ( / ( \ 100% Y 6. define learning set ℳjk l8 m ℳh, "$& ?:no"!& ℳ/ℳh, ?:no"!& 50% 7. randomly choose ℳq rℳjk l8r ∗ 1 g models 0 with repeat from ℳjk l8 8. use (x, K) to learn models in ℳq -50% 9. increase count by 1 10. if #$%"& ≥ " -100% 11. replace first " ∗ d models from ℳ with new and push Figure 3. Results on generators with drift them into the end of list ℳ Results on generators with drift are shown in Figure 3. Additional 12. set #$%"& = 0 and bagInit = True gray columns represent the change compared to generators without 13. end if drift. We can notice a good drop in statistically significant positive correlations in classical estimates CNK and BAGV and the increase 14. return score in statistically significant negative correlations in the incremental estimate iLCV. Incremental scores show a much lower drop in percentage of statistically significant positive correlations. The 4. TESTS AND RESULTS same as in tests on generators without drift, incremental scores We tested the developed algorithms and their classical versions on iCNK and iBAGV in most tests, where both correlation coefficients 4 incremental models (Naive Bayes, Hoeffding Tree, OzaBag and are statistically significant, provide higher correlation coefficients OzaBoost) from [7] using 8 generators from [7] with and without than their classical versions. Also we can notice that unlike in test drift. We test the calculated scores against actual error by with no drift, all incremental scores have a higher percentage of calculating the Person’s correlation coefficient which we statistically significant positive correlation. substantiated with statistical significance. 5. REFERENCES [1] I. Kononenko, M. Kukar. Machine Learning and Data Mining. Horwood (2007). 100% [2] S. Ren, Y. Lian, X. Zou, Incremental Naive Bayesian 80% Learning Algorithm based on Classification Contribution 60% Degree, Journal of computers, vol. 9, no. 8, August 2014: 1967 - 1974. 40% 20% [3] Utgoff, P. E. (1989). Incremental induction of decision trees. 0 Machine Learning, 4, 161-186. -20% [4] N. Oza, S. Russell, Online bagging and boosting, In Artificial -40% Intelligence and Statistics 2001, 105–112, Morgan Kaufmann (2001). Figure 2. Results on generators without drift [5] Z. Bosnić, Ocenjevanje zanesljivosti posameznih napovedi z The results of tests on generators without drift are presented in analizo občutljivosti regresijskih modelov, doktorska Figure 2. On the left side filled columns represent the percentage of disertacija, Fakulteta za računalništvo in informatiko, tests where the reliability score had a positive correlation Ljubljana (2007). coefficient and was statistically significant and empty columns [6] W. Nick Street, YongSeog Kim, A streaming ensemble represent the percentage of tests where the score had a negative algorithm (SEA) for large-scale classification, KDD ’01: correlation coefficient and was statistically significant. The right Proceedings of the seventh ACM SIGKDD international side shows the comparison between incremental and classical conference on Knowledge discovery and data mining, 377- versions of reliability scores. The filled column represents the 382 (2001). percentage of tests where both scores had a statistically significant correlation coefficient and incremental version had a higher value [7] Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard of it, while the empty columns show where higher value of Pfahringer (2010); MOA: Massive Online Analysis; Journal correlation coefficient was held by the classical version. of Machine Learning Research 11: 1601-160. 35 Dve nadgradnji algoritma vzvratnega razširjanja Bojan Ploj Martin Terbuc School Centre Ptuj School Centre Ptuj Volkmerjeva cesta 19, 2250 Ptuj Volkmerjeva cesta 19, 2250 Ptuj Slovenia, Europe Slovenia, Europe +386 2 787 1 800 +386 2 787 1 812 bojan.ploj@scptuj.si martin.terbuc@scptuj.si ABSTRACT val napredka je prinesel algoritem vzvratnega razširjanja [3] In this article two new algorithms for learning Feed-Forward (angleško backpropagation), ki omogoča učenje večslojnih Artificial Neural Network are presented. In the introduction, a nevronskih mrež s povezavami naprej (FFANN) na osnovi brief description of the development of the existing algorithms gradientnega zmanjševanja. Najbolj znana med FFANN je and their flaws are shown. The second part describes the first new večslojni perceptron (MLP), ki lahko rešuje tudi najbolj zahtevne algorithm - Bipropagation. Basic idea is given first, followed by a probleme, če ima za to na voljo dovolj veliko število nevronov in detailed description of the algorithm. In the third part yet another slojev [4]. V praksi se MLP izkaže kot razmeroma zmogljiv new algorithm is given, called Border Pairs Method. Again is first inteligentni sistem, njegov učni algoritem pa ima žal številne given a basic idea and then follows a detailed description of the pomanjkljivosti, zaradi česar se učenje pogosto konča neuspešno. algorithm. In the fourth part the results and findings of Kadar so učni podatki nekoliko obsežnejši, se učni čas zelo experimental work are presented. In the conclusion, it is found podaljša, hkrati pa se za nameček še zmanjša uspešnost učenja. that two described algorithms are fast and reliable - the second Dodatna težava je še dejstvo, da ne poznamo optimalne oblike one is also constructive. FFANN (število slojev in nevronov v posameznem sloju), kar je dokaj pomembno za dober učni rezultat. Premajhna FFANN Categories and Subject Descriptors namreč ne zmore iz podatkov posrkati vsega znanja, prevelika pa • Computing methodologies~Artificial intelligence se nauči prekomerno. In ne samo to, prava sestava FFANN še • Computing methodologies~Supervised learning • Computing vedno ni garancija za uspešno učenje. Med učenjem namreč methodologies~Supervised learning by classification zmanjšujemo učno napako na osnovi gradienta, zato se lahko • Computer systems organization~Neural networks • Theory zgodi, da zaidemo v lokalni minimum in tam obtičimo. Uteži – of computation~Design and analysis of algorithms • Theory of prosti parametri FFANN, ki jih med učenjem optimiziramo, so pri computation~Machine learning theory algoritmu vzvratnega razširjanja sprva izbrani naključno, kar pomeni, da je uspeh učenja odvisen tudi od sreče. Vse našteto nas je privedlo do iskanja alternativnih učnih algoritmov, med General Terms katerimi smo odkrili dva zanimiva. Algorithms, Design, Reliability, Experimentation, Theory, 2. BIPROPAGATION Keywords Prvi novi algoritem se imenuje bipropagation [5, 6] in je manjša Artificial Neural Network, Machine Learning, Supervised nadgradnja algoritma vzvratnega razširjanja. Osnovna ideja Learning, Bipropagation, Border Pairs Method, Multi-layer izboljšave je v tem, da začetne vrednosti uteži in želene vrednosti Perceptron notranjih slojev ne prepuščamo več naključju kot pri vzvratnem razširjanju, ampak jih izračunamo oziroma določimo. Kako to 1. UVOD naredimo bomo prikazali kasneje na primeru fotografij. Prvi modeli umetnih nevronskih mrež (ANN) so nastali pred več Namenoma smo uporabili izraz »notranji sloj« in ne več »skriti kot sedemdesetimi leti (Warren McCulloch in Walter Pitts) ter od sloj«, kot je to običaj pri vzvratnem razširjanju. Če namreč takrat naprej niso doživeli korenitih sprememb. Izdelava novega poznamo želene vrednosti notranjih slojev, se lahko lotimo učenja modela je razmeroma preprosta, saj lahko nevrone tako rekoč vsakega sloja posebej. Tako razbijemo zahteven problem na več poljubno povezujemo med seboj. Težji del naloge je najti manjših, ki jih dobro obvladamo, saj posamični sloji nimajo algoritem, ki bo omogočal uspešno in učinkovito učenje modela. lokalnih minimumov. Edini še nerešeni problem je torej določanje Eden prvih ANN modelov je bil Perceptron, njegov učni smiselnih želenih vrednosti notranjih slojev. Z intuicijo najdemo algoritem – Hebbovo pravilo [1] so odkrili šele čez nekaj let in je rešitev ali dve tudi za ta problem. Najlažje pojasnimo to na omogočal le reševanje preprostih (linearnih) problemov. Modeli primeru fotografij obrazov, kjer z MLP ugotavljamo spol in njihovi učni algoritmi so od takrat naprej postoma napredovali fotografirane osebe. MLP ima toliko vhodov, kolikor je slikovnih in so zmogli reševanje vse bolj zahtevnih problemov. Ta napredek točk (pikslov) in en binarni izhod, kjer velja moški = 0, ženska = se je dogajal v nekakšnih valovih, pri čemer so se izmenjevala 1. Zaradi lažjega razumevanja bodo vsi notranji sloji imeli enako obdobja počasnega in hitrega razvoja, podobno kot letni časi. število nevronov, kot je število vhodov oziroma pikslov. Sicer so Zatišna obdobja so bila tako izrazita, da so dobila celo svoje ime – ta števila lahko brez zapletov tudi drugačna. Prvi notranji sloj ima zima umetne inteligence [2]. Razlog za to naj bi bil psihološke nalogo, da vse piksle osvetli za en odtenek, če je na fotografiji narave, kajti prebojna odkritja so privedla do vzhičenja moški, oziroma potemni, če je ženska. Tako dobimo želene raziskovalcev, ki so jim nato sledila obdobja razočaranja. Drugi vrednosti prvega notranjega sloja, ki so zelo podobne vhodnim 36 vrednostim. Ostane nam le še izbira uteži prvega sloja. Če bi slika tudi realna števila (prav tako na intervalu od 1 do 6) in tako v prvem notranjem sloju ostala enaka vhodni, bi za uteži ustrezala opazovali tudi navidezni prostor med piksli. enotska matrika (nevroni imajo odsekoma linearno prenosno Ugotovitve so bile naslednje: funkcijo). Ker je slika dejansko zelo podobna vhodni, je enotska matrika dobro izhodišče za izračun uteži in zato za dobro (1) Vsi nevroni, navkljub sigmoidni prenosni funkciji, delujejo v naučenost zadostuje že malo število učnih iteracij. Prav tako je zasičenju. Razen v zelo ozkem obmejnem pasu imajo vedno zaradi male spremembe uteži med učenjem prvega sloja mala tudi izhodno vrednost enako nič ali ena. verjetnost, da med učenjem naletimo na lokalni minimum. V (2) Nevroni v prvem sloju lokalno razmejujejo področja bele in višjih notranjih slojih se zgodba ponavlja, fotografija postane črne barve. glede na spol še svetlejša oziroma temnejša in v zadnjem (3) Nevroni drugega sloja opravljajo logično operacijo IN tistih notranjem sloju postane čisto svetla ali temna. Izhodni sloj ima le nevronov iz prvega sloja, ki omejujejo isto črno področje. en nevron, ki ima trivialno zadolžitev: če dobi na vhod množico samih ničel, da na izhodu ničlo, če pa dobi same enice, pa da (4) Nevron tretjega sloja opravlja logično operacijo ALI nevronov enico. Ko zaključimo učenje posameznih slojev, lahko z metodo iz drugega sloja. vzvratnega razširjanja po želji opravimo še fino uglaševanje. 3.2 Sinteza MLP Posplošitev tega algoritma na razvrščanje med več razredi je Za fotografijo na sliki 1 torej velja naslednje: trivialna. Če bi na primer razvrščali desetiške števke, ne bi temnili cele slike, ampak samo eno desetino. Za enico bi temnili prvo • 4 nevroni v prvem sloju (za vsako mejno črto a, b , c in desetino pikslov, za dvojko drugo in tako naprej. d po eden) Glede hitrosti učenja je bipropagation v prednosti zaradi dveh • 2 nevrona za logično operacijo IN v drugem sloju (za razlogov. (1) Uteži so že pred učenjem podobne pravim. (2) Ker vsako področje A in B po eden) se učijo posamezni sloji, ni možno, da bi spremembe uteži v • 1 izhodni nevron za logično operacijo ALI predhodnih slojih motile učenje v trenutnem sloju. Kljub vsem prednostim pa algoritem bipropagation ostaja iterativen, saj še vedno optimizira uteži z računanjem gradienta v številnih malih korakih. Demo program za algoritem Bipropagation je objavljen na spletu [10]. P1 P2 P3 3. METODA MEJNIH PAROV 3.1 Analiza naučenega MLP Drugi algoritem se imenuje metoda mejnih parov (Border Pairs Method, BPM) [6, 7]. Njegovo izhodišče je popolnoma drugačno kot pri algoritmu bipropagation. Najprej smo MLP naučili do te mere, da je čisto vse učne vzorce razvrstil pravilno, nato pa smo Slika 2. Iskanje mejnih parov opazovali lastnosti notranjih slojev. Za učne podatke smo P2 in P3 sta mejni par, ker je njun presek krožnic prazen. uporabili črno-belo fotografijo brez sivih odtenkov – slika 1. Vhodna podatka sta pikslova vrstica Y in stolpec X (dve celi Zgradba MLP je s tem že določena (glej sliko 3.), nerešen nam števili od 1 do 6) izhodni podatek pa barva piksla (0 = bela, 1= ostane le še drugi del naloge – učenje oziroma iskanje optimalnih črna). Med opazovanjem notranjih slojev smo dajali na vhod MLP vrednosti uteži nevronov. V drugem in tretjem sloju se opravljata logični operaciji IN in ALI. Iskanje optimalnih vrednosti uteži y c d zanju je trivialno. Torej smo z drugim in tretjim slojem že opravili 6 Barva piksla 5 a 4 A B 3 b 2 A B 1 1 2 3 4 5 6 x Slika 1. Razmejitev pikslov v črno-beli fotografiji a b c d a, b, c in d so mejne črte; A in B sta področji X Y Slika 3. Zgradba optimalnega MLP Nekatere sinapse v drugem sloju so izločene. 37 v celoti in nam ostane le še učenje nevronov v prvem sloju, ki določamo po enakem postopku, kot smo ga uporabili pri prvem določajo položaj posameznih lokalnih mejnih črt. sloju v dvodimenzionalnem problemu. Pri tem so izhodni podatki enega sloja vhodni podatki naslednjega. Vsak sloj ima enako ali 3.2.1 Določanje mejnih črt manjše število nevronov od predhodnega. Ko ostane le še en Na sliki 1 so narisane štiri mejne črte (črtkane črte a, b, c in d) v nevron, je postopek končan. dvodimenzionalnem prostoru (dimenziji X in Y). Te mejne črte so Razen za učenje MLP so mejni pari primerni tudi za razšumljanje položene tako, da razdelijo celoten prostor na homogena podatkov in ugotavljanje zahtevnosti učnih podatkov že pred področja. Področji A in B vsebujeta le črne piksle, preostala pričetkom učenja. Razšumljanje z mejnimi pari poteka tako, da področja pa samo bele. Pri risanju mejnih črt izhajamo iz udeležence mejnih parov nekoliko premaknemo v takšni smeri, da predpostavke, da mejne črte potekajo med črnim in belim piksli, jih približamo najbližjemu podatku istega razreda, ki ni udeležen ki so si blizu. V ta namen bližnji črno-bel par poimenujemo v mejnem paru. Ostalih podatkov sploh ni potrebno korigirati, saj »mejni par«, pri čemer besedo »bližnji« definiramo s presekom na učenje vplivajo samo mejni pari. krožnic. Krožnici narišemo tako, da je en piksel v središču kroga, drugi pa na krožnici. Če v preseku obeh krožnic ni tretje točke, Zahtevnost učnih podatkov ugotavljam že pred pričetkom učenja potem rečemo, da sta si točki blizu, oziroma da mejita. Piksla P iz deleža podatkov, ki so udeleženi v mejnih parih. Kadar so učni 2 in P podatki nezahtevni, jih je le nekaj % udeleženih v mejnih parih. V 3 na sliki 2. torej tvorita mejni par. Če bi narisali krožnici za P nasprotnem primeru sta dve možnosti: velik šum v podatkih ali pa 1 in P3, bi ugotovili, da je P2 v preseku njunih krožnic ter zato P1 in P podatki skrivajo v sebi zapletena pravila. Lahko pa je tudi 3 nista mejni par. Slika 4. prikazuje vse mejne pare za našo črno-belo sliko. Skupaj je na sliki 12 mejnih parov, ki jih sedaj kombinacija obojega. Podatki v obravnavanem primeru so želimo ločiti s premicami. Gremo po vrsti in najprej ločimo par 1. ekvidistančni, a eksperimenti so pokazali, da algoritem deluje enako dobro tudi pri poljubnem razporedu podatkov. 4. EKSPERIMENTALNO DELO Prava vrednost vsakega izdelka se seveda pokaže med njegovo uporabo, zato smo oba algoritma testirali z različnimi učnimi podatki. Zaradi lažje primerljivosti z drugimi algoritmi in z 1 2 3 4 9 11 drugimi inteligentnimi sistemi smo uporabili uveljavljene učne podatke. Začeli smo z logično funkcijo ekskluzivni ALI (XOR), ki je ena od najmanjših učnih množic, a je vseeno zanimiva zaradi 10 12 izrazitega lokalnega minimuma. Oba algoritma sta se izkazala 5 6 7 8 izvrstno. Vzvratno razširjanje je potrebovalo za isto nalogo petindvajsetkrat več iteracij kot algoritem bipropagation. Rezultati temeljijo na povprečju množice poskusov. BPM je brez težav našla dve mejni črti in izvedla eno logično operacijo IN v enem koraku. Ponavljanje poskusa ni potrebno, ker tu ni naključnih začetnih vrednosti, ki bi vnesle dejavnik sreče v poskus. Slika 4. Mejni pari Naslednja učna množica so bile Perunike (Iris) [8]. To so štiri dimenzionalni podatki, ki vsebujejo tri razrede in 150 učnih Dvoglave puščice ponazarjajo mejne pare, ki jih vzorcev. BPM je že pred pričetkom učenja izločila 146 od 150 sečejo mejne premice učnih vzorcev in nalogo rešila z enim samim nevronom. Iz slike je razvidno, da lahko ista premica loči tudi pare 2, 3 in 4. Najzahtevnejši in najobsežnejši test je bil izveden z zbirko Tako ubijemo 4 muhe na en mah in dobimo prvo vodoravno podatkov MNIST, ki obsega slike 70 tisoč rokopisnih števk v premico. Druga vodoravna premica loči pare 5, 6, 7 in 8. Z dvema ločljivosti 28*28 slikovnih točk (784 dimenzionalni problem), v 8 navpičnima premicama ločimo še ostale štiri pare in ugotovimo, bitni barvni ločljivosti (256 odtenkov sive barve) [9]. Metoda da so skupno potrebne štiri premice. Med postavljanjem premice vzvratnega razširjanja tukaj popolnoma odpove, bipropagation pa učimo en nevron z udeleženci mejnih parov. Tako dobimo 4 najde rešitev v zgolj nekaj 10 iteracijah. BPM je bila preizkušena nevrone v prvem sloju in s tem je znana (skoraj) minimalna le z delnim naborom podatkov MNIST zaradi omejitev strojne zgradba celotnega MLP. Za računanje razdalje med točkami opreme. Rezultat dosežen z delnim naborom MNIST je bil (polmer krogov) in za ugotavljanje ali par meji, uporabljamo zgolj podoben rezultatom, ki jih drugi algoritmi dosegajo s celotnim evklidsko razdaljo, ki je določljiva v prostoru s poljubnim naborom učnih vzorcev. Pri nobenem od obeh obravnavanih številom dimenzij. Tako lahko rešujemo probleme v poljubno algoritmov nismo zaznali nobenih omejitev glede uporabnosti. dimenzionalnem prostoru. Krožnice v tridimenzionalnem prostoru Oba sta primerna za razvrščanje poljubnih podatkov. nadomestijo sfere, v še več dimenzionalnem prostoru pa hipersfere. V redkih primerih se zgodi, da že ločimo vse mejne 5. ZAKLJUČEK pare, a ostane kakšno področje še vedno nehomogeno. V tem Uvodoma smo se seznanili s težavami, ki jih srečamo med primeru se lotimo iskanja dodatnih mejnih parov, a tokrat le še v učenjem nevronskih mrež tipa FFANN oziroma MLP: počasnost, nehomogenem področju. Tako smo med učenjem uporabljali le 20 nezanesljivost, nenatančnost in nekonstruktivnost. V ta namen sta od 36 pikslov. Učili smo le 4 nevrone od skupno 7 in še te 4 nastala dva nova algoritma: »bipropagation« in »metoda mejnih vsakega posamično. Posebej pomembno pri tem pa je, da ta parov« (BPM). Izkazalo se je, da oba nova algoritma odpravljata algoritem v vsakem primeru najde rešitev. V primeru, ko so učni mnoge pomanjkljivosti vzvratnega razširjanja in ne prinašata podatki več kot dvodimenzionalni, si vhodnega prostora ne nobenih novih. Tukaj izpostavljamo konstruktivnost in ne moremo narisati in se zato postopek malo spremeni. Vse sloje iterativnost algoritma BPM. Algoritem BPM je bil do sedaj 38 objavljen le v različici za razvrščanje, v razvoju pa je že tudi International Electrotechical and Computer Science različica za regresijo. Zelo koristna za uveljavljanje obeh novih Conference ERK 2009, Slovenian section IEEE, pp 199-202, algoritmov bi bila izdelava programske opreme zanju. Bodisi 2009. njuna vključitev v uveljavljena programska orodja na področju [6] Bojan Ploj: Advances in Machine Learning Research, 3rd strojnega učenja, ali pa izdelava njunih knjižnic v uveljavljenih chapter, Nova Science Publishers, 2014, New York. programskih jezikih. [7] Bojan Ploj, Milan Zorman, Peter Kokol: Border Pairs 6. REFERENCE Method – Constructive MLP Learning Classification Algorithm, Adaptive and Intelligent Systems,Second [1] Hebb, Donald Olding, The Organization of Behaviour: A International Conference, ICAIS 2011, Klagenfurt, Austria, Neuropsychological Theory, 1949. September 6-8, 2011. Proceedings [2] AI winter, Wikipedija, [8] Iris data set, https://www.wikiwand.com/en/AI_winter, 15. september http://en.wikipedia.org/wiki/Iris_flower_data_set, 12. 1. 2016. 2013 [3] Davida E. Rumelharta, Geoffreya E. Hintona in Ronalda J. [9] Yann LeCun, Corinna Cortes: yan.lecun.com/exdb/mnist, Williamsa, Learning representations by back-propagating MNIST , Handwritten digit database errors, Nature, October 1986. [10] Bojan Ploj: Bipropagation Demo App, [4] Irie, Miyake; Capabilities of three-layered perceptrons, IEEE https://www.mathworks.com/matlabcentral/fileexchange/553 International Conference on Neural Networks, 1988. 89-bipropagation, 10. 9. 2016. [5] B. Ploj, Bipropagation - nov način učenja večslojnega perceptrona (MLP), Proceedings of the Eighteenth 39 Ebook Recommendations Based on Stylometric Features Lovro Žitnik and Marko Robnik-Šikonja University of Ljubljana, Faculty of Computer and Information Science Večna pot 113, 1000 Ljubljana, Slovenia lovro@zitnik.com, marko.robnik@fri.uni-lj.si ABSTRACT rational content-based filtering (CB) recommender system. We present a recommender system for Slovene ebooks using This approach could be used to solve the cold-start problem, stylometric features and machine learning. Recommender where recommendations are required for a new user, and systems have become a standard component of modern web other situations with inadequate data for a collaborative fil- platforms and a popular way for users to find novel items tering (CF). Using the CB we could also recommend books or services. With an increasing number of available Slo- similar to only one designated book, which greatly reduces vene ebooks and growing popularity among users, we were required input of new users. motivated to build a tool that helps them find their next book. As a data set we use books in the ePub format, ano- For the CB we use only the content of ebooks to analyze the nymized list of users from an e-library system and a set of writing style of the authors, and do not use other metadata transactions between books and users. The textual content (e.g. information about authors, literary style, keywords of books is tokenized, POS-tagged and lemmatized. Nume- etc). In pre-processing we use natural language processing rical stylometric features of books are extracted and evalu- techniques and apply statistical methods to obtain 92 nu- ated by unsupervised feature ranking methods SPEC and merical stylometric features for each book. The examples of Laplacian score. The reduced feature set describing books raw features include average number of syllables in a sen- and users is used in clustering and classification tasks, inclu- tence, the percentage of words in direct speech, the percen- ding one-class. Several variants of recommender algorithms tage of common and proper nouns, average concreteness of are presented, based on both content and collaborative fil- the words etc. Additionally, we compute several well-known tering. The automatic and user evaluations show that the readability metrics. We evaluate the importance of features content-based approach with stylometric features is feasi- using unsupervised measures Laplacian score and SPEC. ble, however, the collaborative filtering gives better overall performance. We perform a dimensionality reduction and clustering on the stylometric features of books to reduce noise and produce an Categories and Subject Descriptors input data set for five CB algorithms. The list of users and H.4 [Information Systems Applications]: Miscellane- transactions are processed in the same way to produce two ous; D.2.8 [Software Engineering]: Metrics—complexity data sets as input for two CF algorithms. The first data set measures, performance measures uses only profile data, whereas book rankings are added to the second. Keywords In related work Fleischman et al [5] use natural language stylometric features, ebooks, recommender systems, machine processing (NLP) techniques for recommendations, but do learning, Slovene language, clustering, classification not use stylometry. Vaz et al [12] use writing style and nega- tive examples to prototype a working recommender system. 1. INTRODUCTION Recommender systems (RS) suggest an ordered list of items The paper is based on [16] and consists of six Sections. In to users by predicting their relevance. In our case, an algo- Section 2 we present stylometric features. Section 3 descri- rithm suggests a personalized list of recommended ebooks. bes data sets used to train the algorithms described in Sec- The motivation behind building such a tool were an increa- tion 4. Section 5 contains the evaluation and comparison sed number of Slovene ebooks, growing number of users that of developed approaches. In Section 6 we summarize the demand easier decision-making and the fact that there was conclusions and present guidelines for further work. no available RS for Slovene ebooks. Stylometry is defined as a statistical analysis of variations 2. STYLOMETRIC FEATURES in a literary style for writers or genres. When the preferred Stylometric features (SF) are numerical descriptions of a literary style of a user can be determined, books of the same book’s writing style. They are derived from the content of style could make usable recommendations. a book. In our approach we tokenize books into chapters, paragraphs and sentences. Stopping at the sentence level is The aim of our work is to test the hypothesis that stylome- necessary to preserve the context of each separate word for tric features, derived only from the text, suffice for an ope- a more precise classification. We perform the lemmatization 40 and POS-tagging, where each word or punctuation is clas- Users Transactions Books sified as an adjective, verb, adverb, comma etc. Publicly available Slovene document corpus ccGigafida [4] is used for CSV JSON Raw data learning. We tested several taggers for speed and accuracy 10813 users 2151834 transactions 1160 ePub 323 PDF and choose Brill tagger [1]. 1147 useful + # - * - - * * # From the pre-processed data we derive different SF. Sim- + ? ? + i ple features include percentage of sentences with different Imputation Interpretation Metadata Raw text number of commas, number of punctuation marks, average of transactions number of syllables per word or sentence etc. More advan- Tokenisation + - + - ced features include overall percentage of the quoted words + - One-hot or number of sentences with direct speech. A special care Positive Negative Sentences and words Feature extraction Encoding ratings ratings was taken to preserve pronouns during the analysis. They Processing POS-tagging are usually removed from the analysis via stop-words lists, Lemmatization because they are not relevant for analysis based on content. Statistical methods Pennebaker [10] advocates that pronouns play a vital role in determining writing style. Our analysis confirms this, as < 1, 2, 3 .. 28 > < 1, 2, 3 .. 1175 > < 1, 2, 3 .. 92 > the feature ”Percentage of pronouns”is ranked third by the User feature vector nr.1 User feature vector nr.2 Book feature vector (User profile) (User book profile) Laplacian score [6] and fifth by the SPEC algorithm [14] in a list of total 92 stylometric features. The scores of all SF are available in [16]. Saving to database Figure 1. The diagram of feature extraction process. We compute several readability metrics derived for English language as described by Mailloux et al [8]. While their 1 to 5). In our case the ratings are implicit and have to transfer to Slovene language might be questionable, they be derived from users’ interaction with the e-library web were used previously [15] and our tests in Table 1 show application. The transaction ”Confirm rental” was used as that they are strongly correlated to the text comprehen- the best approximation for positive rating of the book. The sibility. The average word concreteness and average age of best candidate for negative rating was the transaction ”View word acquisition are also used features, based on [7] and [2]. of the book page” occurring only once, without any further interactions with the same book. Table 1. Average readability metrics for different book ca- tegories. Columns report the Gunning fog index (GFI), Fle- The ratings for all books were added to the database for all sch reading-ease score (FRES) and Flesh-Kincaid grade level the users for the purpose of CF. The attributes for all the (FKGL). books contain the value 1 for the positive rating, value -1 for the negative rating, and value 0 if no rating was extracted category sample size GFI FRES FKGL from transactions. This data set is called ”User book profile” E-books for Kids 81 7,26 39,76 10,47 (UBP). E-books for Teens 25 8,41 33,63 11,68 Romance 69 7,91 35,68 11,15 Poetry and Drama 161 8,22 34,21 11,80 For each of the three data sets (books stylometric features, Natural Science, Technology, Math 5 12,04 10,88 15,93 user profile and user book profile) we computed several de- Humanities and Social Sciences 96 13,17 10,13 16,63 rivative data sets with dimensionality reduction techniques, such as PCA and t-SNE. The DBSCAN and k-means algori- 3. DATABASE thms were used for clustering, where the number of centers To form stylometric data sets we used 1047 digitalized ebo- k was determined by optimization of Silhouette score. The oks in ePub format. We obtained anonymized user profiles visualization of one such end result is available in Fig. 2. of 10,813 users of a Slovene e-library system and 2,151,834 transactions between users and books. We extracted and 4. RECOMMENDER SYSTEM consolidated the data into a unified MongoDB database. The process is depicted in Fig. 1 and results in three data Using the data described in the previous Section seven diffe- sets. rent recommendation algorithms were developed: five content- based filtering (CB) algorithms and two collaborative filte- As a result of the feature extraction process, we obtain the ring (CF) algorithms. data set with 92 numerical stylometric features for each book. We assume that the ratings extracted from the implicit user transactions are of two significantly different quality levels, The ”user profile” (UP) consists of a gender, year of birth, i.e., positive ratings as more reliable than negative ratings. education and current status of each user. To impute mis- We describe the tested algorithms using different approaches sing values for some users, we use model-based imputation and different subsets of data below. based on naive Bayes classifier. 4.1 Content-based filtering The biggest challenge was to extract positive and negative The average book algorithm (CNTR) calculates the the book ratings from the transactions. In recommender sy- average book (i.e. the centroid) from the positively rated bo- stems usually explicit user ratings of items are available, oks of a user and uses nearest neighbor techniques proposed where each user rates some items with a score (e.g., from by Sarwar et al [11] to find recommended books. 41 5. EVALUATION We used two types of evaluation: offline testing and user preference testing. With offline testing, we evaluated the prediction accuracy of algorithms by using the existing data. 20% of positi- vely rated books were hidden and the other 80% is used for learning (i.e. input to recommendation algorithms). By measuring how many books from the hidden set the recom- mendation algorithms managed to retrieve we evaluate their precision and recall. The distribution of positively rated books has a strong long- tail characteristics (see Fig. 3). To avoid imbalanced distri- bution, we divided users into 10 groups (G1 - G10), depen- ding on a number of positively rated books (2, 3, 4, 5, 7, 10, 18, 28, 50, or 100) and tested each group separately. Number of users with same number of positively rated books 10000 Figure 2. Clustering of books with k-means (k=12, Silhou- ette optimized) in 10-dimensional space, additionally redu- ced to 2-dimensional space with t-SNE. 1000 The nearest books algorithm (NEAR) loops through 100 the list of positively rated books, adding the next nearest book to the list of recommended books. Number of users 10 The binary classification algorithm (BIN) is the only CF algorithm that uses the negative ratings. The positive 1 and negative ratings of books for a user are used as a train- 0 5 10 15 20 25 30 35 40 46 51 58 64 69 79 91 104 126 141 158 ing data for probabilistic binary classifier, which is used for Number of positively rated books classification of new books. The returned probability scores Figure 3. Number of users (logarithmic scale) in sets with are used to sort the list. same number of positively rated books. The one-class SVM algorithm (SVM) is a modified ver- As a baseline, we included a random book recommendation sion of Support Vector Machine, using learning from only (RAND). All seven algorithms were tested with several di- one class. We discard the negatively rated books and only fferent settings (different type and level of dimensionality learn from positively rated books. We use the resulting clas- reduction, metrics, classifiers etc.). The precision scores for sifier on a new unrated set of books. This approach is descri- the best settings of each algorithm are presented in Fig. 4 bed by Manevitz et al [9] and used by Yu et al [13], where and 5. For details, see [16]. it is reported to have a poor performance. AVERAGE PRECISION BY GROUP The PU algorithm (PU) uses two sets of books from a RAND CNTR NEAR BIN SVM PU CFUPO CFUBP particular user, the positively rated books and the unrated 0.12 books. From the set of unrated books it tries to single out 0.1 the books into the negatively rated books set. In this way 0.08 the problem is transformed into a binary classification pro- blem. we use the algorithm proposed by Elkan et al [3]. 0.06 0.04 4.2 Collaborative filtering 0.02 CF algorithms do not use stylometric features. Instead they 0 rely on the history of users’ transactions. We form two clas- G1 G2 G3 G4 G5 G6 G7 G8 G9 G10 sifiers. Figure 4. Average precision by group. OVERALL AVERAGE AND MEDIAN OF PRECISION The user profile only algorithm (CFUPO) first clusters average median users based only on their user profile, then it finds the ne- 0.025 arest user in the same cluster and recommends the books 0.02 of that user, removing the books already rated positively to 0.015 avoid duplication. 0.01 0.005 The user book profile algorithm (CFUBP) differs from 0 the CFUPO only by using the user book profile. RAND CNTR NEAR BIN SVM PU CFUPO CFUBP Figure 5. Overall average and median of precision. 42 The results show that all the algorithms outperformed the Natural Language Processing, ANLC’92, pages baseline random algorithm. CFUPB, which uses user book 152–155, 1992. profile, scores best, followed surprisingly by SVM, which [2] M. Brysbaert, A. B. Warriner, and V. Kuperman. contradicts low expectations expressed in [13]. We assume Concreteness ratings for 40 thousand generally known that the worst score for BIN is caused by lack of data for tra- English word lemmas. Behavior research methods, ining, whereas other CB algorithms score surprisingly well 46(3):904–911, 2014. compared to CF. [3] C. Elkan and K. Noto. Learning classifiers from only positive and unlabeled data. In Proceedings of the 14th In evaluation of user preferences we assigned 9 users a ACM SIGKDD International Conference on task to select 10 books they like and 5 they do not like. Knowledge Discovery and Data Mining, KDD ’08, The age, birth year, education and current status of the pages 213–220, 2008. tested persons were also collected. WE supplied this input [4] T. Erjavec and N. Logar Berginc. Referenčni korpusi to all 8 recommendation algorithms and assembled a list of slovenskega jezika (cc)Gigafida in (cc)KRES. In recommended books for each user. A visual user interface T. Erjavec and J. Žganec Gros, editors, Zbornik Osme to rank algorithms from 1 to 8 was provided. The first three konference Jezikovne tehnologije, pages 57–62, recommendations were marked with gold, silver and bronze Ljubljana, 2012. medals. The average rank and its standard deviation are [5] M. Fleischman and E. Hovy. Recommendations presented in Fig. 6. without user preferences: A natural language processing approach. In Proceedings of the 8th AVERAGE RANKING AND STANDARD DEVIATION OF RANKING International Conference on Intelligent User average ranking stand. deviation of ranking Interfaces, IUI ’03, pages 242–244, 2003. 7 6 [6] X. He, D. Cai, and P. Niyogi. Laplacian score for 5 feature selection. In Y. Weiss, B. Schölkopf, and J. C. 4 Platt, editors, Advances in Neural Information 3 Processing Systems 18, pages 507–514. 2006. 2 [7] V. Kuperman, H. Stadthagen-Gonzalez, and 1 0 M. Brysbaert. Age-of-acquisition ratings for 30,000 CFUBP CFUPO NEAR PU CNTR BIN RAND SVM English words. Behavior Research Methods, Figure 6. The algorithm ranking by users. 44(4):978–990, 2012. [8] S. L. Mailloux, M. E. Johnson, D. G. Fisher, and T. J. The results show that users prefer CF recommendations, Pettibone. How reliable is computerized assessment of which took first two places, before all CB algorithms. Sur- readability? Computers in nursing, 13:221–221, 1995. prisingly, the random algorithm ranked second to last. This [9] L. M. Manevitz and M. Yousef. One-class SVMs for could be attributed either to the users desire for novel books document classification. J. Mach. Learn. Res., or to a noise, since users reported that ranking lower ranked 2:139–154, Mar. 2002. suggestion was difficult. [10] J. W. Pennebaker. The Secret Life of Pronouns: What Our Words Say About Us. Bloomsbury USA, 2011. 6. CONCLUSIONS [11] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. We propose the content-filtering recommender system for Application of dimensionality reduction in books using only stylometric features. We propose seve- recommender system-a case study. Technical report, ral content-based filtering algorithms using only numerical DTIC Document, 2000. stylometric features extracted from book content. We ben- [12] P. C. Vaz, D. Martins de Matos, and B. Martins. chmark them against two collaborative filtering algorithms Stylometric relevance-feedback towards a hybrid book and random recommendations. recommendation algorithm. In Proceedings of the Fifth ACM Workshop on Research Advances in Large The results of both automatic evaluation and user preferen- Digital Book Repositories and Complementary Media, ces indicate that recommender systems using only stylome- BooksOnline ’12, pages 13–16, 2012. tric features are feasible, but they are inferior in performance [13] H. Yu, W. Zuo, and T. Peng. A new pu learning compared to collaborative filtering recommendations. algorithm for text classification. In Mexican International Conference on Artificial Intelligence, We assume that a significant improvement of content-filtering pages 824–832, 2005. algorithms could be obtained by gathering explicit ratings of [14] Z. Zhao and H. Liu. Spectral feature selection for books. Incorporating the word-space similarity as described supervised and unsupervised learning. In Proceedings in [5] could also prove beneficial. For a real-world book re- of the 24th International Conference on Machine commender system we recommend development of a hybrid Learning, ICML ’07, pages 1151–1157, 2007. algorithm that combines both content-based and collabora- [15] A. Zwitter Vitez. Ugotavljanje avtorstva besedil: tive filtering approaches. This could solve the cold start primer ”trenirkarjev”. In T. Erjavec and problem. J. Žganec Gros, editors, Zbornik Osme konference Jezikovne tehnologije. Ljubljana, 2014. 7. REFERENCES [16] L. Žitnik. Recommender system for slovene electronic [1] E. Brill. A simple rule-based part of speech tagger. In books. University of Ljubljana, Faculty of Computer Proceedings of the Third Conference on Applied and Information Science, 2016. Diploma thesis. 43 Model selection on the JSI grid: Metis use-case Jernej Zupančič Damjan Kužnar Matjaž Gams “Jožef Stefan” Institut “Jožef Stefan” Institut “Jožef Stefan” Institut Jamova cesta 39 Jamova cesta 39 Jamova cesta 39 Ljubljana, Slovenia Ljubljana, Slovenia Ljubljana, Slovenia jernej.zupancic@ijs.com damjan.kuznar@ijs.com matjaz.gams@ijs.com ABSTRACT 3. Gradients are usually not available. The paper presents model selection using a random search method on the Jožef Stefan Institute (JSI) grid infrastruc- ture. Best practices for model selection on the grid are Additionally, in real life scenarios one does not have the presented through the use-case, where the dataset from the access to the ground truth distribution Gx, therefore, the Metis project was used. The model selection task was to find value that we wish to optimize, i.e. expectation over the the best learning algorithm and its parameters according to unknown distribution, is impossible to evaluate. In order to the area under the curve performance metric. Although the mitigate the problem of Gxs unavailability, cross-validation gradient boosting algorithm performed best in the experi- is often used. In cross-validation the expectation is replaced ment, logistic regression was chosen for the implementation with a mean loss over validation set Dvalid. Cross-validation in the Metis prediction module due to the high model trans- is unbiased as long as Dvalid is independent of the data used parency and good performance. by Aλ [4]. Therefore, the problem 1 is approximated by the problem 1. INTRODUCTION A∗, λ∗A ≈ argmin mean L (x; A A∈A,λ∈Λ x∈Dvalid λ(D)). (2) The objective of a learning task T = (L, D) is to find an algo- rithm A and the algorithm parameters λ so that the function Since the problem 2 typically involves an evaluation of sev- f = Aλ (D), which is the result of the use of training algo- eral (A, λ) pairs, which in turn usually require an inner opti- rithm on a dataset, minimizes some expected loss L(x; f ) mization run, the problem is computationally intensive. The over i.i.d. samples x from a ground truth distribution Gx. use of a cluster of computers is therefore required in order to Algorithm A is a functional that maps a dataset D, which obtain satisfiable solutions in a reasonable time frame. JSI is a finite set of samples from Gx, to f . Algorithm A usually grid1 was used in the presented experiments. performs an optimization of some loss metric with respect to parameters θ that correspond to the internal representation The rest of the paper is structured as follows. In section 2 of f by A (e.g. weight values in a neural network). Addi- related work in model selection and hyper-parameter tuning tionally, the algorithm A often has adjustable parameters is presented, in section 3 the experimental setup is described, λ called hyper-parameters (e.g. number and sizes of hidden in section 4 the results from the experiment are presented, layer in a neural network). Those hyper-parameters often and section 5 concludes the paper. determine the performance of the algorithm and their setting is crucial for obtaining an algorithm that effectively learns a function f on the training dataset D and performs well 2. RELATED WORK on new and previously unseen samples from G Model selection and hyper-parameter tuning problems are x. Therefore, an algorithm and its settings that minimize generalization usually addressed by manual search and grid search or a error combination of the two (e.g. [10, 6]). Although several model and hyper-parameter optimization methods have been intro- Ex∼G [L(x; A x λ(D))] duced in the recent years such as sequential model based op- are required. The problem of finding the best model and its timization [7, 3] or efficient global optimization algorithm [8, parameters with respect to the generalization error is called 1] researchers and practitioners still prefer a combination of model selection, while the problem of finding only the best manual and grid search. This combination does have some parameters for a chosen algorithm is called hyper-parameter nice properties: tuning. This paper presents methods (and an application of one such method) to address the model selection problem: 1. It enables the practitioner to build the intuition about A∗, λ∗ the performance of the models and their parameters. A = argminA∈ E [L(x; A A,λ∈Λ x∼Gx λ(D))]. (1) The optimization problem 1 is hard due to several reasons: 2. It enables the practitioner to focus on models and set- tings that are most promising. 1. The problem is usually non-convex. 3. It is effective in lower dimensions. 2. Search space is complicated and hierarchical. 1https://www.ijs.si/ijsw/nsc/Uporaba%20gru%C4%8De%20IJS 44 4. Grid search is simple to implement and embarrassingly – max depth ∈ {5, 8, 15, 25, 30, N one} parallelizable. – min samples split ∈ {1, 2, 4, 10, 15, 100} – min samples leaf ∈ {1, 2, 5, 10} However, it has the disadvantage of not being reproducible – max features ∈ {0log20,0 sqrt0, N one} and not being effective in the cases of large search spaces and higher dimensions. • KNeighborsClassifier: State-of-the-art approaches for model selection and hyper- – n neighbors ∈ {0l01,0 l02} parameter tuning are based on sequential model based opti- – p ∈ {0.001, 0.01, 0.1, 1, 10, 50, 100} mization approaches and have already been implemented in • several software libraries such as AutoWEKA2, AutoSklearn3, SVC: Spearmint4, Hyperopt5, etc. The problem with the sequen- – C ∈ {0.001, 0.01, 0.1, 1, 10, 100, 1000} tial model based optimization methods is that they are not – simple to parallelize and are difficult to implement. In the gamma ∈ {0auto0, 0.01, 0.05, 0.1, 0.2} parallel implementation the authors usually choose a trade- – class weight ∈ {0balanced0, N one} off between effective modelling of the problem and the time • XGBClassifier: speed-up enabled by the use of the cluster [3]. – learning rate ∈ {0.01, 0.025, 0.05, 0.1} Although not state-of-the-art any more the random search – is another method for model selection and parameter tuning n estimators ∈ {120, 300, 500, 800, 1200} which performs very well on high dimensional problems [2]. – gamma ∈ {0.05, 0.3, 0.7, 1.} Additionally, random search is simple to implement and par- – max depth ∈ {3, 6, 12} allelize, while being much more efficient than a grid search. – min child weight ∈ {1, 3, 7} 3. EXPERIMENTAL SETUP – subsample ∈ {0.6, 0.8, 1.} Random search method for model selection was implemented – colsample bytree ∈ {0.6, 0.8, 1.} for a use in a multi-node setting. Each worker ran indepen- – reg lambda ∈ {0.01, 0.025, 0.05, 0.075, 0.1, 1.} dently and it followed the following steps: – reg alpha ∈ {0, 0.1, 0.25, 0.5, 0.75, 1.} 1. Set-up an environment on a computational node. The search space includes 235 538 possible combinations. 2. Sample from available algorithms and corresponding Taking into account the time required to build the model parameters. in the case of random forest or tree boosting (about 1 hour as observed from the log files) a full grid search was not 3. Evaluate the performance of a chosen model. feasible. 4. Report the performance of the model to the central node. 3.1 The data The data about student performance analyzed in the Metis 5. Repeat from step 2. project was used to provide a use-case. The data includes the same features as presented in [9], however, the informa- tion from more students was added. In the use-case the in- Learning algorithms available in established Python 3.5 li- formation of 70 155 unique students at four subjects (Math- braries Scikit-learn [11] (logistic regression, random forest, ematics, Slovene, English and Chemistry) was used, which naive bayes, nearest neighbours, support vector classifier) translates to 1 736 979 training instances. Each instance and XGBoost [5] (tree boosting) were selected as possible represents a state of the pupil at the end of a month (e.g. algorithms. MongoDB6 was used to store the results. The October) at one subject (e.g. Mathematics). This should following parameter options (according to the algorithm im- result in 10 instances for one pupil for one subject per a plementation) were available: school year, however, since the data for December was not available only 9 instances per school year were available. A • LogisticRegression: total of 32 features was constructed taking into account the grades in current and previous grading periods and simple – penalty ∈ {0l01,0 l02} statistics derived from the grades (min, max, mean, devi- – C ∈ {0.001, 0.01, 0.1, 1, 10, 50, 100} ation, difference), the number of excused and not excused absences, time remaining to the end of the grading period, • RandomForestClassifier: and the number of times a pupil was missing from the ex- amination. The goal was to predict whether a pupil will – n estimators ∈ {120, 300, 500, 800, 1200} have a negative grade at the end of the grading period. The 2http://www.cs.ubc.ca/labs/beta/Projects/autoweka/ dataset included 114 414 of positive instances (i.e. student 3http://automl.github.io/auto-sklearn/stable/ had a negative grade at the end of the grading period) and 4https://github.com/HIPS/Spearmint 1 622 565 of negative instances (i.e. student had a positive 5http://hyperopt.github.io/hyperopt/ grade at the end of the grading period). Due to sensitive 6https://www.mongodb.com/ personal information the data is not publicly available. 45 3.2 Model evaluation GaussianNB We have used an area under the curve (AUC), standard met- ric for the evaluation of binary classification models, with LogisticRegression few adjustments. To evaluate a model, instances were first divided by month. The 5-fold cross validation was used on RandomForestClassifier the instances of the same month to get a cross validation probability prediction for every instance. Using the pre- model XGBClassifier dicted probability for every instance AUC metric was cal- culated for every month, which was the result of the model KNeighborsClassifier evaluation. SVC 3.3 The grid 0.80 0.82 0.84 0.86 0.88 0.90 0.92 0.94 0.96 JSI computing cluster, which is a part of SLING - Slovenian mean_scores initiative for national grid, is accessed through the member- ship at nsc.ijs.si virtual organization. The cluster enables Figure 1: A box-plot of scores grouped by a model. the submission of batch jobs using the Advanced Resource Connector (ARC)7 middleware and has 1 984 CPUs avail- able for computational tasks. Every researcher also has the right to be a member of the gen.vo.sling.si virtual orga- 4.2 The grid experience nization used for submitting jobs to common Arnes cluster, Although the availability of the grid is a big bonus for the however, this cluster is usually occupied and the times for JSI researchers, it still has some issues that have to be taken the jobs to start and eventually finish are much longer than into account when writing software that is to be run on the using the JSI cluster. grid. The connection between nodes in a grid is not avail- 4. RESULTS able (at least through Python and SSH). This makes the deployment of software designed for multiple nodes not 4.1 Metis use-case results possible. The experiment was run on 20 computational nodes in the grid for 24 hours. We have obtained 112 model evaluations. When an error occurs in a job the whole job is can- The reason for such a low number of successful model evalu- celled. Even when multiple processes are run on one node, ations was the failure of computing nodes on the grid, which it the program in the process results in an error, the whole will be explained in more details in section 4.2. As described job is terminated. in 3.2 evaluation values were obtained for every month dur- ing the school year. In order to compare between them a Only batch jobs are possible. While some problems are mean of those values was calculated for each model eval- appropriate for a batch job grid usage type, others are not. uation score. It was observed (Figure 1) that XGBClas- Especially the data mining problems, where the practitioner sifier was performing best no matter the parameter values tries several approaches and requires instant feedback in or- used (mean score of 0.939 AUC), followed by RandomForest- der to proceed with the next step. Classifier (mean score of 0.937 AUC) and LogisticRegression (mean score of 0.935 AUC). All results are presented in Fig- The following best practices are proposed when working with ure 2. GaussianNB model was evaluated only once, since it the JSI grid: does not require the setting of parameters, while SVC model was also evaluated only once, however, that was due to the Parallelize your code so that each worker can work fact that all other attempts to evaluate the SVC model on on independent task. It would be best if no communi- the given data resulted in error. cation between workers is required, the only communication would be the reporting of results to your local database, Although the XGBClassifier was performing best, we have which is accessible to the grid inside the JSI. This will en- chosen LogisticRegression for the use in the Metis prediction able you to launch as many workers as you wish. module, since it is a transparent model and it performs sim- ilar to the best performing XGBClassifier (when parameters Use only one CPU per worker/job. In the case of error are set to the values found in the search that perform best you will loose only one computation node. in case of LogisticRegression - penalty =0 l01, C = 1). Use local testing environment for testing the scripts We have also ranked the models according to the Pareto you are about to submit as a job to the grid. Best, front dominance (Figure 3). Although the first front does use container technologies (e.g. Docker8) to start a fresh present some valuable insight (only XGBClassifier and Ran- Linux environment (as used by the grid), transfer all the domForestClassifier appear on the first front and are there- required scripts and data to the environment and perform a fore non-dominated) the information from other second and test (e.g. in a case of data mining task a test could be done other fronts does not provide additional information, since with smaller amount of data). the fronts strive for diversity. 7http://www.nordugrid.org/arc/about-arc.html 8https://www.docker.com/ 46 1.0 0.8 0.6 GaussianNB KNeighborsClassifier 0.4 LogisticRegression mean_scores RandomForestClassifier 0.2 SVC XGBClassifier 0.0 0 20 40 60 80 100 sorted_index Figure 2: Mean scores over all months for all evaluations. 18 6. REFERENCES model 16 [1] T. Bartz-Beielstein and S. Markon. Tuning search GaussianNB algorithms for real-world applications: A regression 14 KNeighborsClassifier LogisticRegression tree based approach. In Evolutionary Computation, 12 RandomForestClassifier SVC 2004. CEC2004. Congress on, volume 1, pages 10 XGBClassifier 1111–1118. IEEE, 2004. 8 [2] J. Bergstra and Y. Bengio. Random search for hyper-parameter optimization. Journal of Machine 6 appearance_in_front Learning Research, 13(Feb):281–305, 2012. 4 [3] J. S. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. 2 Algorithms for hyper-parameter optimization. In 0 Advances in Neural Information Processing Systems, 0 1 2 3 4 5 6 7 8 9 10 front pages 2546–2554, 2011. [4] C. M. Bishop. Neural networks for pattern recognition. Oxford university press, 1995. Figure 3: Models ranked by Pareto dominance. [5] T. Chen and C. Guestrin. Xgboost: A scalable tree boosting system. arXiv preprint arXiv:1603.02754, 2016. Use scripts for submitting the jobs and obtaining [6] G. Hinton. A practical guide to training restricted the logs. boltzmann machines. Momentum, 9(1):926, 2010. [7] F. Hutter. Automated configuration of algorithms for 5. CONCLUSION solving hard computational problems. PhD thesis, The paper presents the model selection and hyper-parameter University of British Columbia, 2009. tuning problems. Since the problem is computationally de- [8] D. R. Jones, M. Schonlau, and W. J. Welch. Efficient manding JSI grid infrastructure was used in combination global optimization of expensive black-box functions. with a simple random search method for optimization in Journal of Global optimization, 13(4):455–492, 1998. complex search space. Methods were applied to the Metis [9] D. Kuznar and M. Gams. Metis: system for early use-case, where the goal was to predict whether a pupil will detection and prevention of student failure. In have a negative grade at the end of the grading period. Using 6thInternational Workshop on Combinations of random search on the grid we have found a well performing Intelligent Methods and Applications (CIMA 2016), white-box model that will be applied in production. page 39, 2016. [10] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, The experience in working with the JSI grid infrastructure and Y. Bengio. An empirical evaluation of deep was also described. It was found that connectivity issues architectures on problems with many factors of between nodes, resource issues on the nodes themselves, and variation. In Proceedings of the 24th international batch job submissions prevent the user from fully exploiting conference on Machine learning, pages 473–480. ACM, the grid resources. However, the grid can still be exploited 2007. by following the proposed best practices. By increasing the [11] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, awareness of the grid availability among the JSI researchers B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, we expect that several issues that are now present will be R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, removed in the future. This will enable JSI researchers to D. Cournapeau, M. Brucher, M. Perrot, and use the grid to its full potential and give the JSI advantage E. Duchesnay. Scikit-learn: Machine learning in in national and international research projects. Python. Journal of Machine Learning Research, 12:2825–2830, 2011. All scripts and data for this paper are available at https: //repo.ijs.si/jernejzupancic/IS2016/tree/master. 47 Sinteza slovenskega govora na mobilni platformi Android Tomaž Šef Institut “Jožef Stefan” Jamova cesta 39 1000 Ljubljana +386 1 477 34 19 tomaz.sef@ijs.si POVZETEK verziji prav tako podpirata »on-line« sintezo, ki zahteva podatkovno povezavo, s to razliko, da so novo razviti glasovi, ki V članku predstavljamo mobilno aplikacijo in e-storitev, ki zagotavljajo občutno višjo stopnjo naravnosti rezultirajočega omogoča sintezo slovenskega govora na mobilni platformi sintetičnega govora, dosegljivi le preko eBralca On-Line. Android. Podpora se vgradi v sam operacijski sistem mobilne naprave, kar zagotavlja povezljivost s poljubnim programom nameščenim 2. E-STORITEV na tej napravi. Opisana rešitev je nepogrešljiva za E-storitev omogoča vgradnjo poljubnega Microsoft SAPI slepe in slabovidne ter osebe z motnjami branja, ker jim omogoča rabo različnih orodij za delo z mobilnimi kompatibilnega (slovenskega) sintetizatorja govora v operacijski napravami, učenje in samopomoč. Koristna sistem mobilnih naprav. E-storitev je sestavljena iz strežnika in je tudi za vse druge uporabnike mobilne mobilne aplikacije. platforme Android, saj je govor velikokrat najbolj primerna in naravna oz. včasih celo edina varna oblika posredovanja informacij Strežnik na eni strani omogoča povezavo s sintetizatorjem govora, in drugih vsebin. na drugi strani pa z mobilno napravo. Sintetizator govora je lahko Ključne besede naložen bodisi na istem strežniku kot sama strežniška aplikacija bodisi se poveže z nekim oddaljenim strežnikom z nameščenim Govorni bralnik besedil, sinteza slovenskega govora, mobilna sintetizatorjem govora (npr. s strežnikom eGovorec). Sintetizator platforma Android. govora lahko razvijalci sproti posodabljajo, ne da bi uporabniki za to sploh vedeli oz. jim za uporabo najnovejše različice ni potrebno 1. UVOD narediti čisto nič (ni nobene potrebe po prenosu sintetizatorja Razvili smo mobilno aplikacijo in e-storitev, ki omogoča sintezo govora, saj se ta posodobi na samem strežniku). slovenskega govora na mobilni platformi Android. Sistem je Mobilna aplikacija na platformi Android omogoča vgradnjo zasnovan v obliki strežnika v oblaku in pripadajočih aplikacij. glasov vseh v strežnik prijavljenih sintetizatorjev govora v Podpira industrijski standard Android Speech API, kar zagotavlja operacijski sistem mobilne naprave. V nastavitvenem meniju vključitev sintetizatorja govora v sam operacijski sistem in mobilne naprave se pri govornem bralniku avtomatsko pojavijo še posledično njegovo povezljivost s poljubnimi programi na tej slovenski glasovi. razširjeni mobilni platformi. Po kvaliteti nekoliko slabši sintetizator slovenskega govora se nahaja na sami mobilni napravi (off-line Arhitektura obeh razvitih e-storitev je podobna Tako DysLex različica) in omogoča branje tudi ob izklopljeni ali nedosegljivi kot eBralec KSS znata samodejno preklapljati med 3G in Wifi podatkovni povezavi. Bolj kakovosten in naraven govor je povezavo, pri čemer lahko rabo mobilne podatkovne povezave dosegljiv preko e-storitve v oblaku (on-line različica), ki za svoje (3G) tudi onemogočimo. Vgrajeni senzor prenosa podatkov na delovanje potrebuje podatkovno povezavo (3G ali Wifi) z mobilni napravi zagotavlja spremljanje in nadzor nad porabo oddaljenim govornim strežnikom. podatkovnega prenosa. Ob odsotnosti kakršnekoli podatkovne povezave sistema avtomatsko preklopita na (po kvaliteti sicer Opisana rešitev je nepogrešljiva za slepe in slabovidne ter osebe z slabšo) »off-line« sintezo govora. Ko je povezava z govornim motnjami branja, saj jim omogoča rabo različnih orodij za delo z strežnikom ponovno mogoča, pa se sinteza govora začne zopet mobilnimi napravami, učenje in samopomoč. Najbolj razširjena izvajati v boljši kvaliteti. programska oprema je že prilagojena njihovim specifičnim potrebam. Uporabna pa je tudi za vse druge uporabnike mobilne Arhitektura razvite e-storitve DysLex s strežniško aplikacijo v platforme Android, saj je govor velikokrat najbolj primerna in Arnesovem oblaku in pripadajočo mobilno aplikacijo naravna oz. včasih celo edina varna oblika posredovanja informacij (odjemalcem) na mobilni napravi je prikazana na sliki 1 in drugih vsebin (npr. branje elektronske pošte med vožnjo). (http://dis.ijs.si/dyslex/). Sintetizatorji govora se nahajajo bodisi v Arnesovem oblaku (MS SAPI 5 združljivi) bodisi na samostojnem Trenutno obstajata dve različici programa oz. e-storitve: strežniku (e-Govorec).  eBralec KSS (https://play.google.com/store/apps/details?id=si. ijs.dis. ebralec) je namenjen slepim in slabovidnim, osebam z Pri e-storitvi eBralec KSS se strežniška aplikacija in novo razviti motnjami branja ter zaposlenim v javni upravi, sintetizator govora eBralec nahajata izključno na strežniku  DysLex [1] (https://play.google.com/store/apps/details?id=si. Knjižnice slepih in slabovidnih. eBralec KSS zna dodatno ijs. dyslex) je brezplačna storitev, ki je splošno dostopna in upravljati z vgrajeno zaščito novega sintetizatorja govora, nastavlja vsem na razpolago. se lahko tudi različne parametre »Besednika« in »Analizatorja«. Androidna aplikacija eBralec KSS omogoča še vpis testnega Obe verziji vključujeta »off-line« sintetizator slovenskega govora, besedila, rabo odložišča ter posreduje (govorna) navodila o rabi ki deluje tudi takrat, ko podatkovna povezava ni na razpolago. Obe programa (slika 2). 48 DysLex ARNESOV strežniška eGovorec OBLAK aplikacija … DysLex Govorec Proteus FreeTTS mobilna ali FreeTTS (združljivi z MS SAPI 5) (strežnik spletna aplikacija v oblaku) Slika 1. Arhitektura e-storitve DysLex. E-storitev se zelo preprosto nadgrajuje in posodabljala. Rešitev  branje izbranega besedila v poljubni aplikaciji (za je horizontalno skalabilna, s čimer je omogočeno optimalno lektoriranje lastnega dela dislektikov [2], branje knjig, prilagajanje strojnih virov dejanskim potrebam oz. rasti števila uporabnikov in vsebin; omogočena je hitra vzporedna obdelava na izgovorjava posameznih besed, branje elektronskih večprocesorskih računalnikih, podprta je možnost namestitve n izročkov, dostopanje na internet), a grozd računalnikov (v oblaku). Sintetizatorje govora lahko  branje zaslona mobilne naprave, razvijalci posodabljajo sproti, ne da bi uporabniki za to sploh  avtomatsko branje samodejnih popravkov in velikih vedeli. Slednji zgolj opažajo, da je govor čedalje bolj razumljiv in začetnic, naraven ter da se v sistemu pojavljajo novi glasovi v slovenskem  branje menijev v načinu za slepe in slabovidne oz. jeziku. drugače prizadete (funkcija TalkBack). Vgradnja govornega bralnika v sam operacijski sistem mobilne Pri uporabi sintetizatorjev govora je namreč zelo pomembno, da je naprave (s podporo standardu Android Speech API, slika 3) zagotavlja največjo uporabniško dostopnost in najboljšo podprto čim več tipov elektronskega gradiva (npr. tudi pdf datoteke in internetni viri) ter čim več že obstoječe programske opreme. uporabniško izkušnjo. Avtomatsko začnejo delovati vse z Slovenski glasovi sedaj delujejo v vseh aplikacijah, ki uporabljajo operacijskim sistemom podprte funkcionalnosti govornega govorni bralnik operacijskega sistema Android; vključno z vsemi bralnika mobilne naprave tudi v slovenskem jeziku. Takšne aplikacijami za dislektike ter slepe in slabovidne. funkcionalnosti so npr.: Slika 2. Androidna aplikacija eBralec KSS 49 Slika 3. Vgradnja eBralca v izbirnike operacijskega Sistema Android Slika 4. Shema sistema eBralec za tvorjenje govora z uporabo PMM [4] Strežnik e-storitve DysLex je dostopen tudi programerjem, ki “country”:”SI”, bi želeli podporo sinteze (slovenskega) govora vnesti neposredno v “localVoice”:true, svoje programe. Odjemalec, ki je bodisi spletna stran bodisi “remoteServerName”:””, aplikacija, pošlje želeno besedilo na strežnik HTTP. Strežnik “remoteServerIP”:”localhost”} pretvori besedilo v govor, ki ga v obliki zvočnega zapisa pošlje Funkcija speak nazaj odjemalcu v predvajanje. Strežnik se nahaja na IP naslovu 141.ablak.arnes.si in posluša na vrata 80. Splošne funkcije:  Namenjena je pretvorbi besedila v govor.  Parametri: Funkcija info f=speak – strežniku pove, da hoče pretvorbo besedila v zvok  Namenjena je splošni poizvedbi o glasovih, ki so nameščeni na t=”” – besedilo, ki bo pretvorjeno v zvok strežniku DysLex. Odjemalec praviloma najprej kliče to v=”” – ime glasu za sintezo govora funkcijo, da lahko (po potrebi) v svojem uporabniškem s=”” – hitrost govora vmesniku prikaže spisek vseh razpoložljivih glasov. o= “” – kodiranje  Zahteva tipa GET (GET Request)  Primer sinteze: Primer: http://141.ablak.arnes.si/?f=info http://141.ablak.arnes.si/?f=speak&t=Primer&v=Renato%20  Odgovor tipa application/json Govorec&s=0&o=mpeg Pomembni parametri odgovora so: id, name, lang, country in Funkcija heartbeat age  Primer odgovora v JSON formatu:  Namenjena je za preverjanje delovanja strežnika. {“id”:”GovRenato”,  Odgovor s statusom 200 OK in besedilo Server is working! “name”:”Renato Govorec”,  Primer zahteve: http://141.ablak.arnes.si/?f=heartbeat “lang”:”slv”, Spletna stran projekta DysLex s podrobnejšimi informacijami se “gender”:”Male”, nahaja na naslovu http://dis.ijs.si/dyslex/. “age”:”Adult”, 50 3. SINTETIZATOR GOVORA 4. ZAKLJUČEK Umetno generirani govor mora zveneti naravno in biti prijeten za Predstavljeni e-storitvi in mobilni aplikaciji omogočata podporo poslušanje [3]. Pomembne so tudi nastavitve za hitrost branja in sintezi slovenskega govora v operacijskem sistemu Android, kar jakost zvoka ter možnost uporabe različnih glasov. pomeni, da lahko umetno generirani govor poslušamo v vseh eBralec (http://ebralec.si) je najnovejši sintetizator slovenskega aplikacijah (npr. PDF reader, brskalnik), ki sintezo govora podpirajo in so naložene na govora [4]. Prvenstveno je namenjen slepim in slabovidnim teh napravah. Za operacijska sistema iOS in Windows Phone takšna vgradnja žal še ni možna, ker uporabnikom ter osebam z motnjami branja. V primerjavi s proizvajalca operacijskega sistema te funkcionalnosti zaenkrat še preostalimi (predhodnimi) sintetizatorji govora za slovenski jezik omogoča občutno višjo stopnjo naravnosti sintetičnega govora. ne omogočata. Aplikacija eBralec je plod sodelovanja Instituta »Jožef Stefan« ter Na mobilnih napravah z Androidom lahko dislektiki ter slepi in podjetij Alpineon in Amebis. Vsebuje moški (eBralec Renato) in slabovidni sedaj uporabljajo vse (učne) pripomočke, izdelane za ženski (eBralec Maja) glas. Temelji na obsežnih govornih korpusih globalni trg ali že standardno vgrajene v napravo. (tabela 1) in sintezi govora s prikritimi markovovimi modeli Naravnost in razumljivost novega sintetizatorja slovenskega (PMM; slika 4). Na platformi Android je dosegljiv preko aplikacije govora eBralec (http://ebralec.si/) sta primerljivi s sintetizatorji eBralec KSS v »on-line« načinu delovanja. govora za druge jezike. Poslušanje takšnega govora ni več naporno, Velikost besednega 7.145.345 povedi zato je sintetizator primeren za najširši krog uporabnikov. korpusa 77 milijonov besed 5. ZAHVALA Obseg govorne zbirke Posebna zahvala gre Dejanu Kostadinovskemu in Janiju Bizjaku, 4.000 povedi ki sta sprogramirala strežniško in mobilno aplikacijo ter zagotovila 46.785 besed podporo standardu Android Speech API. 6 ur 3 min posnetkov za Operacijo je delno sofinancirala Evropska unija iz Evropskega ženski glas sklada za regionalni razvoj ter Ministrstvo za izobraževanje, znanost in šport. Operacija se je izvajala v okviru Operativnega 5 ur 33 min posnetkov za moški programa krepitve regionalnih razvojnih potencialov za obdobje glas 2007-2013, razvojne prioritete: Gospodarsko razvojna Število različnih difonov v 1.883 infrastruktura; prednostne usmeritve Informacijska družba. zbirki Razvoj eBralca je bil delno financiran v okviru projekta Knjižnica Število različnih trifonov v 21.369 slepih in slabovidnih Minke Skaberne. Operacijo je delno zbirki financirala Evropska unija iz Evropskega socialnega sklada. (24.702) Operacija se je izvajala v okviru Operativnega programa razvoja (št. kombinacij v korpusu) človeških virov, razvojne prioritete "Enake možnosti in spodbujanje socialne vključenosti", prednostne usmeritve "Dvig Tabela 1. Podatki o govorni zbirki eBralca [4] zaposlenosti ranljivih družbenih skupin na področju kulture in podpora njihovi socialni vključenosti". Govorec je starejši sintetizator govora, ki temelji na difonski sintezi. Prva različica je nastala v letu 1998, kasneje pa so se posamezni moduli nadgrajevali. Difonska sinteza govora zveni 6. LITERATURA IN VIRI precej manj naravno od korpusne [5, 6]. Vebuje dva moška glasova: [1] Šef, T., 2015. Univerzalni govorni e-bralnik za slovenski jezik Renato Govorec in Matej Govorec. Na platformi Android sta oba kot osebni učni pripomoček za ljudi z disleksijo in različnimi glasova dosegljiva tako preko aplikacije eBralec KSS kot preko vrstami motnje vida, Vzgoja in izobraževanje v informacijski aplikacije DysLex v »on-line« načinu delovanja. Oba glasova sta družbi - VIVID 2015 : zbornik referatov, Fakulteta za priložena tudi novemu sintetizatorju govora eBralec. organizacijske vede, str. 510-519, 2015. E-govorec (http://dis.ijs.si/e-govorec/) ponudnikom najraz- [2] Bravo, društvo za pomoč otrokom in mladostnikom s ličnejših e-vsebin omogoča dinamično podajanje informacij v specifičnimi učnimi težavami. (http://www.drustvo-bravo.si). govorni obliki ter v domačem slovenskem jeziku. Po kvaliteti oz. [3] Taylor, P., 2009. Text-to-Speech Synthesis, Cambridge naravnosti govora je enak Govorcu. Gre za brezplačno storitev, ki University Press. jo lahko uporablja kdorkoli. Do njega dostopa tudi aplikacija DysLex. [4] Žganec Gros, J., Vesnicer, B.,Rozman, S., Holozan, P., Šef, T., 2016. Sintetizator govora za slovenščino eBralec, Zbornik eBralec off-line je kompakten sintetizator slovenskega govora, ki konference Jezikovne tehnologije in digitalna humanistika, teče neposredno na mobilni napravi. Temelji na difonski sintezi. Pri Filozofska fakulteta, Univerza v Ljubljani, 2016. njegovem razvoju je bil poudarek predvsem na odzivnosti ter majhni potrebi po računski moči in pomnilniku. Posledica je [5] Šef, T., Gams, M., 2003. Speaker (GOVOREC): a complete občutno nižja kvaliteta oz. naravnost govora kot pri eBralcu. Slovenian text-to speech system . International journal of Njegova prednost je, da za svoje delovanje na mobilnih napravah Speech Technology, Kluwer Academic Publishers, št. 6, str. ne potrebuje podatkovne povezave. Vsebuje moški glas Aleš. Na 277-287, 2003. platformi Android ga v »off-line« načinu uporabljata tako [6] Šef, T. 2001. Analiza besedila v postopku sinteze slovenskega aplikacija eBralec KSS kot DysLex. govora. Doktorska disertacija, Fakulteta za računalništvo in informatiko, Univerza v Ljubljani, 2001. 51 Showing the Knee of a 4-D Pareto Front Approximation via Different Visualization Methods Tea Tušar Bogdan Filipič Department of Intelligent Systems Department of Intelligent Systems Jožef Stefan Institute Jožef Stefan Institute Ljubljana, Slovenia Ljubljana, Slovenia tea.tusar@ijs.si bogdan.filipic@ijs.si ABSTRACT 50 When decision makers select one or more trade-off solu- tions to a multiobjective optimization problem, they are 40 mostly interested in solutions residing at knees—regions of the Pareto front where a small improvement in one objective leads to a large deterioration in at least one other objective. 30 It is therefore important to be able to detect such regions, f 2 preferably through visualization. This paper presents visu- alizations of Pareto front approximations of a multiobjective 20 problem with knees. More specifically, we show how a sam- pled Pareto front of a four-objective problem with a single 10 knee looks like when visualized using seven different meth- ods (scatter plot matrix, bubble chart, parallel coordinates, radial coordinate visualization, level diagrams, hyper-radial 0 0 10 20 30 40 50 visualization and prosections). We can observe that while f1 the first four methods cannot visualize the knee, the remain- ing three are able to do so and are therefore more suitable for use in visualization of Pareto front approximations. Figure 1: The DEB2DK problem with a single knee (solutions near the knee are shown in red). 1. INTRODUCTION When inspecting a set of solutions to a multiobjective opti- mization problem, decision makers usually prefer solutions 50 lying at knees—regions of the Pareto front where a small improvement in one objective leads to a large deterioration 40 in at least one other objective [2]. Visualizing knee regions in a clear and concise way is therefore a crucial requirement 30 f for supporting the decision making process. Finding a good 3 20 representation of the Pareto front and its knees is rather straightforward in the case of two or three objectives (see 10 Figures 1 and 2 for two examples)1, but very challenging when the number of objectives is four or more [8]. 0 10 20 30 40 This paper presents visualizations of the single-knee instance 50 0 10 20 30 40 50 f1 of the DEB4DK problem [2, 7] using seven different meth- f2 ods, namely the scatter plot matrix, bubble chart, parallel coordinates [5], radial coordinate visualization [4], level dia- Figure 2: The DEB3DK problem with a single knee grams [1], hyper-radial visualization [3] and prosections [8]. (solutions near the knee are shown in red). All these methods have been previously used to visualize Pareto front approximations and have been analyzed with respect to some desired properties for visualization meth- ods [8]. 2. THE DEB4DK PROBLEM Branke et al. introduced a family of optimization problems Section 2 provides the formal definition of the DEB4DK with knees that are based on the DTLZ benchmark problems problem, while the visualizations with different methods are and scalable to any number of objectives [2]. While only shown in Section 3. Some concluding remarks are presented the instances with two and three objectives were presented in Section 4. originally, they were scaled to four and five objectives in [7]. 1Minimization in all objectives is assumed throughout this paper. Let us recall the formal definition of the DEB4DK problem 52 with four objectives: 60 60 60 π π π f1(x) = g(x)r(x) sin x1 sin x2 sin x3 2 2 2 40 40 40 π π π f 2 f 3 f 4 f2(x) = g(x)r(x) sin x1 sin x2 cos x3 20 20 20 2 2 2 π π f3(x) = g(x)r(x) sin x1 cos x2 0 0 0 2 2 0 20 40 60 0 20 40 60 0 20 40 60 π f f f f 1 1 1 4(x) = g(x)r(x) cos x1 60 60 2 n 9 X 40 40 g(x) = 1 + xi n − 1 f 3 f 4 i=2 20 20 r1(x1) + r2(x2) + r3(x3) r(x) = 3 0 0 0 20 40 60 0 20 40 60 3 cos(2Kπxi) r f2 f2 i(xi) = 5 + 10(xi − 0.5)2 + K 60 x ∈ [0, 1]n. 40 f 4 20 Here, n denotes the dimensionality of the decision space, while parameter K is used to control the number of knees 0 on the Pareto front. A DEBmDK optimization problem 0 20 40 60 with m objectives has Km−1 knees. This means that the f3 four-objective DEB4DK problem can be instantiated to have 1, 8, 27, . . . knees (for K = 1, 2, 3, . . . respectively). In the Figure 3: Scatter plot matrix of 300 Pareto-optimal rest of the paper, the value K = 1 yielding the Pareto front solutions of the DEB4DK problem with a single knee with a single knee will be used. (solutions near the knee are shown in red). 3. RESULTS OF DIFFERENT VISUALIZA- 60 TION METHODS 50 This section shows how different methods visualize the single 40 knee of the DEB4DK problem with K = 1. f 30 3 20 The set of solutions approximating the Pareto front (also called approximation set) was obtained by sampling the front 10 of the DEB4DK problem with 3000 points. Among them, 0 eight solutions closest to the knee were chosen as the so- 10 lutions “at the knee” to be emphasized in the visualizations 20 (analogous to the red knee solutions emphasized in Figures 1 30 f and 2). The underlying idea is that a visualization method 1 40 should be capable of showing the knee solutions in a way 60 50 50 40 that makes the knee recognizable to a decision maker. 30 60 20 10 0 f2 Since some methods have difficulties when visualizing a large number of solutions, only the first 300 solutions were used Figure 4: Bubble chart showing 300 Pareto-optimal in such cases. Out of the eight knee solutions contained in solutions of the DEB4DK problem with a single knee the original sample, five were retained in this smaller set. (solutions near the knee are shown in red). 3.1 Scatter Plot Matrix Let us first view the results of the scatter plot matrix—a approximation set with 300 solutions. Similarly as before, matrix of all possible 2-D projections of a multidimensional this plot shows that the red knee solutions have low values set of solutions. Figure 3 presents the scatter plot matrix in all four objectives, but we cannot perceive the presence for the smaller approximation set containing 300 solutions. of a knee. Because of the projection to a 2-D space, most of the in- formation is lost and the solutions at the knee cannot be 3.3 Parallel Coordinates differentiated from the rest of the solutions. Even an inter- Parallel coordinates [5] is a very popular method for visu- active exploration cannot be of much help in such a case. alizing multidimensional data that represents objectives as vertical parallel lines. Each solution is shown as a poly-line 3.2 Bubble chart intersecting each objective (i.e. coordinate) at its value. In A bubble chart is a scatter plot in which three objectives the case of our smaller approximation set (see Figure 5), the are shown on axes while the fourth objective is represented knee solutions do not stand out from the rest and therefore with point size. Figure 4 contains the bubble chart for the do not give a good idea regarding the shape of the approx- 53 60 f1 50 40 30 20 f f 4 2 10 0 f1 f2 f3 f4 Figure 5: Parallel coordinates of 300 Pareto-optimal solutions of the DEB4DK problem with a single knee (solutions near the knee are shown in red). imation set. Moreover, the method is very sensitive to the f3 number of displayed solutions. When this number is small, it can be very informative and is able to show dominance Figure 6: Radial coordinate visualization of 3000 relations between solutions. When, on the other hand, hun- Pareto-optimal solutions of the DEB4DK problem dreds or thousands of solutions need to be visualized, the with a single knee (solutions near the knee are shown cluttered result causes loss of most of the information. in red). 3.4 Radial Coordinate Visualization The radial coordinate visualization (also called RadViz [4]) 60 60 places objectives on the circumference of a unit circle and the solutions inside the circle so that the distance of a solu- 40 40 tion to each of the objectives is proportional to its value in that objective. For example, solutions with greater values in 20 20 Distance to Distance to the ideal point the ideal point f1 than in any other objective are placed closer to f1 than 0 0 the other objectives. Figure 6 shows the radial coordinate 0 20 40 60 0 20 40 60 visualization of the entire approximation set with 3000 so- f1 f2 lutions. Again, the knee solutions cannot be differentiated 60 60 from the rest in this plot. 40 40 3.5 Level Diagrams 20 20 Distance to Distance to Level diagrams [1] plot each solution against one of its ob- the ideal point the ideal point 0 0 jectives and the (Euclidean) distance to the ideal point. In 0 20 40 60 0 20 40 60 case of four objectives, four such diagrams are produced. f3 f4 Figure 7 presents the level diagrams for the smaller approx- imation set and we can clearly see the knee in all of them. Figure 7: Level diagrams for 300 Pareto-optimal so- lutions of the DEB4DK problem with a single knee 3.6 Hyper-Radial Visualization (solutions near the knee are shown in red). In hyper-radial visualization [3] the solutions are plotted against their distance to the ideal point (their hyper-radius) separately for two subsets of objectives. In our case, the the ideal point would probably be undetectable with these x axis represents the hyper-radius for objectives f1 and f2, visualization methods. Showing this is left for future work. while the y axis represents the hyper-radius for objectives f3 and f4. As shown in Figure 8, the eight knee solutions from the larger approximation set are clearly recognizable in this 3.7 Prosections visualization. Prosections [6, 8] are projections of only a section of the objective space at a time. For example, a section of width A note on the level diagrams and the hyper-radial visual- d is selected at angle ϕ on the plane f1f2. All solutions ization: while both methods are able to show the knee in that fall in this section are projected onto the line going this case, this is mostly due to the fact that the knee has through the ideal point at angle ϕ while all other solutions the shortest distance to the ideal point. In case of multi- are discarded. By showing a 3-D scatter plot with the x ple knees, for example, the knees that are not very close to axis representing the projected line and the other two axes 54 solutions is possible, it also requires to plot multiple visual- 40 izations (using various planes and angles) to get a complete view of an approximation set. Another advantage of this 35 method is that it can easily handle large sets of solutions because only a part of the set is visualized in a single plot. 30 4. CONCLUSIONS This paper presented how seven visualization methods pre- 25 viously used to visualize Pareto front approximations cope with visualization of knees. Only three methods (level dia- f 4f 20 f 3f grams, hyper-radial visualization and prosections) were able to show the single knee of the four-objective DEB4DK prob- 15 lem. Additional work is needed to see how these visualiza- tion methods perform on a problem with more knees. 10 5. ACKNOWLEDGMENTS This work is part of a project that has received funding from 5 the European Union’s Horizon 2020 research and innovation program under grant agreement No. 692286. This work was 0 partially funded by the Slovenian Research Agency under 0 5 10 15 20 25 30 35 40 research program P2-0209. f1 f f 1 2 f 6. REFERENCES Figure 8: Hyper-radial visualization of 3000 Pareto- [1] X. Blasco, J. M. Herrero, J. Sanchis, and M. Mart´ınez. optimal solutions of the DEB4DK problem with a A new graphical visualization of n-dimensional Pareto single knee (solutions near the knee are shown in front for decision-making in multiobjective red). optimization. Information Sciences, 178(20):3908–3924, 2008. [2] J. Branke, K. Deb, H. Dierolf, and M. Oswald. Finding 60 knees in multi-objective optimization. In Proceedings of the 8th International Conference on Parallel Problem 50 Solving from Nature, PPSN VIII, Birmingham, UK, 40 volume 3242 of Lecture Notes in Computer Science, f 30 pages 722–731. Springer, 2004. 4 [3] P.-W. Chiu and C. Bloebaum. Hyper-radial 20 visualization (HRV) method with range-based 10 preferences for multi-objective decision making. Structural and Multidisciplinary Optimization, 0 5 10 15 20 25 30 40(1–6):97–115, 2010. 35 f 40 0 10 20 30 40 50 60 f 1f2 3 [4] P. E. Hoffman, G. G. Grinstein, K. Marx, I. Grosse, and E. Stanley. DNA visual and analytic data mining. In Proceedings of the IEEE Conference on Visualization, Figure 9: Prosection 4D(0, f1f2, 45◦, 2) showing a part pages 437–441, Los Alamitos, CA, USA, 1997. IEEE. of 3000 Pareto-optimal solutions of the DEB4DK [5] A. Inselberg. Parallel Coordinates: Visual problem with a single knee (solutions near the knee Multidimensional Geometry and its Applications. are shown in red). Springer, New York, NY, USA, 2009. [6] T. Tušar and B. Filipič. Visualizing 4D approximation sets of multiobjective optimizers with prosections. In representing objectives f3 and f4, we get a 3-D representa- Proceedings of the Genetic and Evolutionary tion of this section of the 4-D objective space that can be Computation Conference, GECCO ’11, Dublin, Ireland, denoted as 4D(0, f1f2, ϕ, d). Figure 9 shows the prosection pages 737–744. ACM, 2011. 4D(0, f1f2, 45◦, 2), that is, a prosection on the plane f1f2 using the angle ϕ = 45◦ and section width d = 2. The knee [7] T. Tušar and B. Filipič. Scaling and visualizing points are clearly visible in this plot. In theory, prosections multiobjective optimization test problems with knees. should be able to visualize any number of knees regardless of In Proceedings of the 15th International their position (using multiple plots—each visualizing a dif- Multiconference Information Society, IS 2012, ferent section of the space), but experiments to demonstrate Ljubljana, Slovenia, volume A, pages 155–158, this are still needed. Ljubljana, Slovenia, 2012. Jožef Stefan Institute. [8] T. Tušar and B. Filipič. Visualization of Pareto front The advantages and disadvantages of prosections originate approximations in evolutionary multiobjective from the fact that only a section of the space is visualized at optimization: A critical review and the prosection a time. While this means that a visualization of high accu- method. IEEE Transactions on Evolutionary racy that mostly preserves the dominance relations among Computation, 19(2):225–245, 2015. 55 Machine learning method for stress detection with an EEG device Martin Gjoreski, Mitja Luštrek, Matjaž Gams Department of Intel igent Systems, Jožef Stefan Institute Jožef Stefan International Postgraduate School Jamova cesta 39, Ljubljana, Slovenia {martin.gjoreski, mitja.lustrek, matjaz.gams}@ijs.si ABSTRACT Electroencephalography (EEG) is a method for monitoring the We present a study for stress detection using a one-channel electrical activity of the brain. It reflects the upper cognitive commercial EEG device. For this purpose, a stress-inducing functions and mental or psychological states. In general, increased experiments were performed. 21 subjects were monitored while activity in the higher frequency bands is correlated with arousal, performing mental-arithmetic tasks under time and evaluation thinking or emotional activity. For example, the midrange beta pressure. The data was used to develop a machine learning frequency band (16–20 Hz) is considered to be correlated with method for stress detection which consists of: a segmentation emotional “intensity” that may come from anxiety [20]. In phase, a filtering phase, a feature extraction phase and a model- contrast, a lower spectral activity, e.g., activity in the delta learning phase. The initial evaluation results are promising and frequency band (0.1Hz to 3Hz) is correlated with deep sleep. showcase that person-specific models can perform for a 19 The recent technological advancements have made cheap percentage points better than the majority class. commercial EEG devices generally available. Depending on the quality of the device, they may capture the contrasting sleep- thinking working modes of the brain or possibly even stress/no- Categories and Subject Descriptors stress situations. In this study, we explore the use of a single J.3 [Computer Applications]: Health channel EEG device for stress detection in office-like conditions. Keywords 2. RELATED WORK Stress detection; EEG device; machine learning; health. Even though there are numerous studies on stress using psychology instruments or hormones analysis, the number of 1. INTRODUCTION stress studies using electroencephalogram (EEG) for stress Stress is a process triggered by a demanding physical and/or detection is quite low. This indicates the complexity of the psychological event [6]. It is not necessarily a negative process, problem [19]. Furthermore, the use of a commercial one-channel but when present continuously it can result in chronical stress, device makes the problem of stress detection even more which has negative health consequences such as raised blood challenging. pressure, bad sleep, increased vulnerability to infections, slower Peng et al. analyzed methods for identifying chronic stress by a body recovery and decreased mental performance [7]. Regarding medical 3-chnnel EEG device. They used statistical analysis to the economic costs of stress, the European Commission projected distinguish between relaxed state and chronical stress. They have the costs of work-related stress to €25 billion a year. This is concluded that there are different hemispheric activities between because work-related stress leads to increased number of injuries, left and right hemispheres for the no-stress and stress subjects. increased absenteeism and decreased productivity [3]. Therefore, Similar findings were presented by Giannakakis et al. [19] who a stress-detection system would be useful for self-management of were detecting stress in subjects while watching videos using a 32 mental (and consequently physical) health of workers [5], students channel EEG device. They have asserted that EEG response to and others in the stressful environment of today’s world. positive valence stimuli is left frontal activity, while negative To develop a stress-detection method, we must better understand valence stimuli cause an increase to right activity. However, these the stress process. When humans undergo a demanding physical changes cannot be captured with a single channel EEG device. and/or psychological event (e.g., meeting, exam, etc.), the body is Recently, Saeed et al. [21] presented a study for stress detection faced with a large physiological stressor which invokes a response using a one channel EEG device. They tried to distinguish of the sympathetic nervous system to meet the increased metabolic between high and low stress based on a 3 minute data labelled demands [1]. The sympathetic nervous system speeds up certain using a Perceived Stress Scale questionnaire. This questionnaire processes within the body (“fight-or-flight” response) [2]: it raises gives information about stress levels accumulated over longer the heart rate, raises sweating rate, increases blood pressure, etc. period of time since it contains questions such as: “In the last All of these changes are controlled by the brain, thus it is expected month how many times …”. for the brain’s activity to be changed in a stressful situation. In addition, it has been reported that there are neuro-chemical In our study, stress is induced using a standardized method, and it changes due to stress response [10]. is assessed using State-Trait Anxiety Inventory (STAI) questionnaire, which provides momentary information. Thus, we 56 are trying to detect different stress levels over a short period of Table 1. Mean STAI score for the experimental data. time (e.g., 25 minutes). Type Before Easy Medium Hard In the literature there are other approaches for stress detection using wearable devices which capture physiological signals such STAI score 10.95 13.33 14.05 13.81 as heart rate, sweating rate, R-R intervals, skin temperature and similar [22][23][24]. However, these approaches are out of the For each subject, the period before answering the STAI scope of this study. questionnaire in which they achieved the lowest score is labelled as low stress, and for each +3 STAI points (the statistical tests 3. DATA showed that difference of 2.38 is enough), the stress label is A standardized stress-inducing method was used for collecting the increased by one, thus we get low stress (lowest STAI score), experimental data. A web application was developed in medium stress (lowest STAI score +3) and high stress (lowest collaboration with psychologists, which implements a variation of STAI score +6). In the final experiments the medium and high the stress-inducing method presented by Dedovic et al. [4]. The stress were merged because only two subjects achieved a high main stressor is solving a mental arithmetic task under time and level of stress, so we had two degrees of stress low and high. The evaluation pressure. In short, a series of randomly generated overall duration of the low stress data was 840 minutes and the equations were presented to subjects, who provide answers duration of the high stress events was 724 minutes. verbally. The time given per equation was dynamically changing. For each two consecutive correct answers the time was shortened 4. METHOD by 10%, and for each two consecutive wrong answers the time The proposed method is a machine-learning pipeline presented in was increased by 10%. Each session consisted of three series of Figure 2. During the experiments, the data was streamed to a equations with increasing difficulty: easy, medium and hard. Each smartphone via Bluetooth, then transferred to a computer where series of equations lasted for five minutes. For motivation, a the rest of the processing was performed. The method contains reward was promised to the top three participants. After each several phases: filtering, segmentation, feature extraction and stage, the participant was shown false ranking score, positioning model learning. In the next subsections, each stage is described in him/her in the top five, and this way motivating him/her to try details. harder in the next stage and try to win the award. The application is available on-line: http://dis.ijs.si/thestest/. During the experiments, the subjects were monitored with a NeuroSky's MindSet as presented in Figure 1. It is a wearable EEG that provides raw EEG data with a 512Hz sampling frequency by an electrode placed on left side of the forehead (at the FP1 location). In addition, the device provides computed 1Hz data for 7 frequency sub-bands, information for meditation intensity, information for attention intensity and information for detected blinks. Figure 2. The proposed method for stress detection from EEG. 4.1 Filtering and Segmentation After a visual inspection of the data, it was obvious that the data contains noisy segments which needed filtering. For detecting outliers (noise) in the data we used Tukey’s method of leveraging the interquartile range [11]. This method is independent of distributional assumptions and it ignores the mean and standard deviation, thus it is resistant to extreme values in the range. After the detection, each outlier is replaced with the average value of the two neighboring samples. Figure 3 presents an example data of one subject. On the x-axis is the time in seconds and on the y- Figure 1. Experimental setup. Subject 1 performing the stress axis is the amplitude of the high beta band (20-30 Hz). The inducing experiment. bottom graph is the raw data as provided by the device and the top graph is the same data after filtering. Four Short STAI-Y anxiety questionnaires [8] were filled by each participant: before the experiment (1), and after the easy (2), medium (3) and hard session (4). The STAI questionnaire reflects the degree to which the subjects have a feeling of unease, worry, tension, and stress. Its value range is from 0 to 24, with 24 being the highest level. The mean STAI score is presented in Table 1. We performed statistical analysis using repeated measures ANOVA [9]. The resulting p-value was 0.0014, confirming that there is a statistically significant difference in the answers for the intensity of stress. The answers of the STAI questionnaire were Figure 3. Data of one experimental session for Subject 3. A used for subject-specific labelling of the data. raw signal (bottom) and the same signal after filtering (top). 57 4.2 Feature extraction extracted come from one source, the raw EEG data. Naïve Byes is In the feature extraction phase, a sliding-window technique was well known for performing poorly on correlated datasets due to its used. From each window, numerical features were calculated “naïve” assumption that features are not correlated. On the other using statistical functions (mean, variability, quartiles, quartile hand, the best performing classifier is Random Forest. This is due deviation and power) and regression analysis (intercept and slope to its noise robustness. The highest achieved accuracy is 71% of a fitted regression line). The features were computed from each which is for 19 percentage points better the majority classifier. data type provided by the device (raw EEG data downsampled to Table 2. Accuracy (%) for each of the ML methods for a 4Hz because the high frequency does not influence significantly varying size of the feature extraction window. the computed features, frequency bands’ data, meditation data, Feature-extraction attention data and blinks data). In addition, blinks per second, significant blinks per seconds and power of significant blinks was ML window (minutes) computed from the blinks data. A blink was considered significant Classifier 0.5 1 2 4 if its value was over some threshold (80), which was chosen Majority 52 52 52 52 experimentally. Naïve Bayes 55 53 53 54 4.3 Model learning Boosting 57 57 60 55 For the model learning we used the machine learning algorithms J48 60 63 57 56 as implemented in the WEKA machine learning toolkit [25]. We KNN (3) 62 61 64 66 experimented with variety of ML algorithms: SMO 64 64 64 63 • Majority classifier – always predicts the majority label. Bagging 66 65 66 63 This is a baseline model. Random Forest 70 71 71 66 • J48 – algorithms for building a decision tree (DT) [12]. 6. CONCLUSION AND DISCUSSION • Naïve Bayes classifier – algorithm for building simple We presented a study for stress detection with a one-channel probabilistic classifiers based on Bayes' theorem with strong commercial EEG device. For this purpose, a stress-inducing independence assumptions between the features [13]. experiments were performed allowing us to gather EEG data for 21-subjects. This data can be utilized for studying the brain • KNN – simple algorithm which provides a prediction based activity under stress, induced using time and evaluation pressure. on k-nearest instances (from the training set) to the test instance 0. In addition, we presented a machine learning method for stress detection including segmentation technique, filtering technique • SVM – algorithm for building a classifier where the and a feature extraction technique. The initial evaluation results classification function is a hyperplane in the feature space are promising, but still there is a room for improvements. [15]. An inter-study comparison with the related work is not directly applicable mainly due to the use of different devices, different • Bagging – ensemble algorithm which learns base models on subsets of train data with the purpose of reducing variance datasets, different number of subjects etc. As an example, Vanitha and avoiding overfitting [16]. The algorithm for learning et al. [26] presented a system for real time stress detection based base models was J48. on 14-channel EEG device. The accuracy of the proposed machine learning models varied form 70-90%. This clearly • Boosting – ensemble algorithm which learns models on suggests that better results can be achieved with EEG devices that subsets of the train data and “boosts” the weights of provide data from more channels. misrecognized instances allowing for the models in the The evaluation technique used in the experiments is a 10-fold ensemble to focus on the misclassified instances [17]. The cross-validation which is person-depended evaluation approach. algorithm for learning base models was J48. We also experimented with a person-independed evaluation • Random Forest – ensemble method which learns base technique (leave-one-subject-out cross-validation) and the decision trees by subsetting the feature set [18]. performance of the models (in terms of accuracy) was near the majority classifier. There may be several reasons for this: 5. EXPERIMENTAL RESULTS • Subject specificity. The EEG data may be subject-specific We performed experiments with a varying size of the sliding because each person has a specific reaction to such stressor. window used in the feature extraction phase (size of the feature- The specific reaction may not allow to build general ML extraction window). We experimented with a window size of 0.5, model trained on one set of subjects and tested on another 1, 2 and 4 minutes. For evaluation we used 10-fold cross- set of subjects. validation technique. The results are presented in Table 2. Each • Noise. The commercial EEG device may not be powerful row represents the accuracy achieved by the specific classifier and enough to capture some of the subtleties which would allow each column represent the size of the feature-extraction window. building general ML model. Each recorded session may be The results show that the feature-extraction window size does not different due to persistent noise in the data. influence significantly the performance of any of the classifiers. The worst performing classifier is Naïve Bayes. This is due to the • Advanced features. The extracted features in this study are correlation of the features in the dataset. A high correlation is based on statistical and regression analysis. In future, more expected since all of the signals from which the features are advanced features can be extracted to improve the models’ performance and generalization. 58 7. REFERENCES [1] A. Angeli, M.minetto, A. Dovio, P. Paccoti, “The [14] D. Aha, D. Kibler, “Instance-based learning algorithms,” overtraining syndrome in athletes: A stress-related disorder”, Machine Learning Vol. 6, pp. 37–66, 1991. Journal of Endocrinological Investigation, 2004, pp 603-612. [15] N. Cristianini, J. Shawe-Taylor, “An Introduction to Support [2] W. B. Cannon. The Wisdom of the Body. 1932. Vector Machines and Other Kernel-Based Learning [3] Calculating the cost of work-related stress and psychosocial Methods,” Cambridge University Press: Cambridge, 2000. risks European Risk Observatory Literature Review [Online]. [16] L. Breiman, “Technical Report No. 421,” 1994. Available: https://osha.europa.eu/en/tools-and [17] K. Michael, “Thoughts on Hypothesis Boosting,” publications/publications/literature_reviews/calculating-the- Unpublished manuscript, December 1988. cost-of-work-related-stress-and-psychosocial-risks, [Accessed 15.10.2014]. [18] H. Tin Kam, “Random Decision Forests,” International Conference on Document Analysis and Recognition, [4] K. Dedovic, R. Renwick, N. K. Mahani, and V. Engert, “The Montreal, QC, Canada, pp. 278–282, 1995. Montreal Imaging Stress Task : using functional imaging to investigate the ...,” vol. 30, no. 5, pp. 319–325, 2005. [19] G. Giannakakis, D. Grigoriadis, M. Tsiknakis, “Detection of stress/anxiety state from EEG features during video [5] Fit4Work Project, http://www.fit4work-aal.eu/ watching,” IEEE Eng Med Biol Soc. pp. 6034-7, 2015. [6] H. G. Ice, G. D. James. Measuring Stress in Humans: A [20] P.M. Lehrer, R.L. Woolfolk, W.E. Sime. “Principles and practical Guide for the field,” Cambridge university press. practice of stress management,” Guilford Press, 2007. 2007. [21] S. Saeed, S. M. Anwar, M. Majid, A. Bhatti, “Psychological [7] S.C. Segerstrom, G.E. Miller. Psychological stress and the Stress Measurement Using Low Cost Single Channel EEG human immune system: a meta-analytic study of 30 years of Headset,” International Symposium on Signal Processing and inquiry. Psychological Bulletin 130, 2004, 601 Information Technology, 2015. [8] C. D. Spielberger. State-Trait Anxiety Inventory Anxiety. [22] M. Gjoreski. “Continuous Stress Monitoring Using a Wrist vol. 19. p. 2009, 1987. Device and a Smartphone,” Master thesis, Jozef Stefan [9] T. Eftimov, P. Korošec, B. Koroušic Seljak. Disadvantages International Postgraduate School, Slovenia, 2016. of Statistical Comparison Of Stochastic Optimization [23] J. A. Healey, R. W. Picard, “Detecting Stress During Real- Algorithms. Proceedings of the Bioinspired Optimizaiton World Driving Tasks Using Physiological Sensors,” IEEE Methods and their Applications, BIOMA 2016. Transactions on Intelligent Transportation Systems, 2005. [10] McEwen, Bruce S., and John H. Morrison. "The brain on [24] K. Hovsepian, M. Absi, T. Kamarck, and M. Nakajima, stress: vulnerability and plasticity of the prefrontal cortex “cStress : Towards a Gold Standard for Continuous Stress over the life course." Neuron 79.1 (2013): 16-29. Assessment in the Mobile Environment,” ACM Conf. on [11] J.W. Tukey. “Exploratory data analysis.” Addison-Wesely, Ubiquitous Computing, 2015. 1977. [25] M. Hall, et al. “The WEKA Data Mining Software: An [12] J.R. Quinlan, “Improved use of continuous attributes in Update,” SIGKDD Explorations, Vol 11, Issue 1, 2009. c4.5,” J. Artif. Intell. Res., Vol. 4, pp. 77–90, 1996. [26] V. Vanitha, P. Krishnan, “Real time stress detection system [13] R. Stuart, N. Peter, “Artificial Intelligence: A Modern based on EEG signals.” Biomedical Research, pp. 271-275, Approach,” 2nd Ed., Prentice Hall: New York, USA, 2003. 2016. 59 DESIGN OF MOBILE SYSTEMS THAT ENABLE AN ACTIVE AND MEANINGFUL LIFE FOR DEMENTIA PATIENTS David A. Fabjan Ambient Intelligence Research Group Department of Intelligent Systems Jožef Stefan Institute Jamova cesta 39, 1000 Ljubljana, Slovenia Tel: +386 1 477 3865; fax: +386 1 4251038 E-mail: david.fabjan@ijs.si ABSTRACT Cognitive impairment may be determined by multiple The paper presents an outlook on two expert mobile health neuropathological processes including neurodegenerative systems, proposed for the benefits of prolonged and diseases, by far the most common being Alzheimer disease, meaningful living for people in the early stages of and vascular disease. In fact, most frequently, cognitive Dementia. impairment is the result of more than one disease process (degenerative and vascular) [1]. Cognitive impairment The research group has recently proposed such prototype represents a progressive process that gets worse over time systems to be built and hopefully used commercially. Both and has no foreseen completely effective cure. systems help people with mild cognitive impairments serving as memory aid in one case, and as a cognitive The main goal for people affected by Dementia is the ability training tool in the second case, which are both presented in to maintain an acceptable level of autonomy that would this exploratory paper. allow them to live a quality life at home, despite the limitations imposed by the condition. Similarly, family The main goal of proposed prototypes is to enable and caregivers and care professionals of nursing homes and day maintain an acceptable level of autonomy that would allow care facilities are in need of technological tools to help the affected people to live an active life, despite the limitations people with Dementia develop a coping strategy concerning imposed by the condition. their illness. Keywords The quality of life is associated with health, social Mobile systems, Dementia, cognitive impairments, ambient relationships and ability of using resources such as shops, sensors, context aware systems, prediction, wearables, care transportation and housing [3]. Those people suffering from giving. dementia score poorly on all these criteria; they are frequently ill, have difficulties in maintaining social 1 INTRODUCTION relationships, and are unable to utilise the resources at their disposal. To improve the quality of life, people with Dementia is one of the most frequent neurological disorders dementia need to maintain most of their personal affecting the elderly population. It is estimated that in 2015, independence, engage in social interaction, and must keep there were 46.8 million people living with dementia themselves occupied [4]. worldwide, with one new case every 3 seconds [1]. A disproportionally large number of them – 10.5 million – These factors are interrelated and there is no consensus live in Europe, since it has much older population on about their relative importance, although some literature average than other continents. suggests that independence in the activities of daily living is the most important one [5]. To address these important This spectrum of diseases with common name Dementia is factors, the research group in collaboration with the characterized by a decline in one or more of the seven international team of medical and social experts on cognitive domains (learning, memory, language, executive Dementia and specialists in “serious games” and human- function, attention, and perceptual-motor) from the previous computer interaction, proposed two intelligent mobile (normal, functioning) level of functionality. Such decline is systems. severe enough to interfere with normal daily activities and leads to personal dependence on external help even when 2 SMART COACH performing basic tasks such as hygiene and daily diet. First system is re-hashing an old idea of using short reminder notes (post-it, stickers), strategically placed to 60 remind us of important or frequent tasks. Generally, external learning, sensor fusion and heuristics to infer information memory aids are the most frequent strategy for dealing with such as the user’s location (in the kitchen, on the sofa) and the memory loss caused by Dementia [6]. Reminder notes activity (cooking, resting). Most methods at this stage will are usually placed in various locations around the home – run on the sensing devices (computer vision and sound those for meal preparation in the kitchen, for personal analysis on the tablet) to reduce the communication hygiene in the bathroom, etc. overhead. However, the reminder notes should also depend on what The second stage will create a global awareness of the user’s the person is currently doing: while putting clothes in the situation involving all the sensing devices. Machine learning washing machine - on how to operate it, on how to use and sensor fusion are used to recognize only parts of the shampoo; when going to the kitchen in the morning - about user’s situation. Next step will involve decision-support making coffee, about drinking water, about taking medicine, methods such as proprietary hierarchical qualitative decision etc. If one wanted to have a paper note for every occasion, models and expert defined rules to completely determine the though, they would become too numerous to be of use and it user’s situation and the most probable intent. Algorithms for would become very confusing for the individual, particularly these methods will run remotely in the cloud. when trying to follow certain timeline of tasks. The third stage will use symbolic reasoning based on an The system intends to develop intelligent reminder notes ontology and rules to select the most appropriate reminders that are context-sensitive. To identify the activity of the user, and instructions based on the user’s situation evaluated and we have to use various types of ambient sensors to detect inferred in the second stage. Methods in this stage will be what the person is doing. In addition, we will be able to also implemented in the cloud, returning the output to the utilize a pre-programmed schedule of the person’s normal display devices. and procedural daily activities (eating, personal hygiene, exercises) as well as learn individual’s personal habits. This combined can give us a very good idea about what exactly the person is doing and what the person wants or needs to accomplish at a given moment. This way we will be able to display just the right reminder with the correct steps to follow at the appropriate time. 2.1 TECHNOLOGY AND METHODOLOGY The farfetched technology for the system would be smart glasses or even smart contact lenses that would sense and analyse the user’s environment, interpret context and intent, and provide the appropriate reminders or procedural Figure 1. The Architecture of the smart coach system instructions directly in front of the eyes [7]. From proven technology, available in the market, the following options 2.2 EXPECTED IMPACT were selected: Each person with dementia living at home is cared for by Display: Tablets mounted on strategic locations in the user’s two or three carers on average. With the growing number of home and smart glasses, almost indistinguishable from dementia cases, this “dependency ratio” is difficult to regular glasses; sustain, resulting in a high demand for care solutions, and representing a growing burden in terms of economic, social Sensing: Camera and microphone in the tablet, we use a and human costs. camera and perform vision and sound analysis, intended to detect the presence of the user, basic activities such as The main impact of the expert mobile system is to prolong handling dishes, and environmental events; 3D sensing the independence and living at home for people with mild technology similar to Microsoft Kinect; a wristband with and moderate dementia. We want to extend such living for at multiple sensors for user’s activities recognition; RFID least two years before constant care and living in nursing reader that can detect tags attached to various important homes is needed. objects in the user’s home; localisation system using Bluetooth Low Energy beacons to detect the exact user’s Staying in home environments has the smallest socio- location. economic cost, which is important due to demographic changes where the population is getting older. The interpretation and reasoning from the data acquired will be done in three stages. The first stage will use machine 61 3 PLAN AND EXECUTE DAILY ACTIVITIES professionals will have a library of learning modules and games that will simplify their job in training the users to live The second system proposed has the objective of cognitive well with dementia; the families of the person with dementia training [8] to exercise the remaining cognitive abilities of will be able to understand those activities in which their people with mild cognitive impairment and moderate forms relatives find more difficulties to help them with the coping of Dementia to help them remain independent as long as strategy. possible. 3.1 TECHNOLOGY AND METHODOLOGY While cognitive training cannot fully reverse the damage caused by the disease, it can slow it down considerably and The three main components of the prototype are a serious even improve the abilities in specific areas. Research shows game for cognitive training of executive function, a mobile [3] that cognitive training is particularly effective when it is device companion to track the progress of the execution of personalized, taking into account the person’s specific the trained activities in real life, e-learning methods to train situation and goals – this type of cognitive training is the users (people with Dementia and their carers) in the use sometimes called cognitive rehabilitation. The training can of the system, and a back-end to connect all the components. address various cognitive functions: memory, attention, executive function and others. While there are several applications available for cognitive training of memory and attention, the executive function has been neglected, which is exactly what this system is aiming to address. Executive function is the ability to execute tasks that involve planning and decision making. It helps us deal with the procedural tasks necessary to perform required set of activities in daily life that are not well rehearsed, such as shopping, preparing meals and visiting a doctor. The core of the system developed in this project will be a serious game that will simulate such tasks, including various mishaps that can occur (forgetting a wallet while going Figure 3. The Architecture of the 2nd prototype system shopping, the shop being out of an ingredient for the planned meal, taking a wrong bus when going to the doctor). This Back-end of the system supports the previous three will enable the users to train and prepare themselves for components with: serious game scenarios defined during the these tasks in the safety of their own homes on a tablet or on gathering of user requirements; serious game configuration a TV without the stress brought about by mishaps in the real input by the care giver; user medical and psychological world. profile, preferences and the history of the use of the system; a module for advanced data analysis for the serious game executions in terms of personalisation, planning and analysis of the progress. E-learning materials are using an existing technology OpenEDX. The OpenEDX will be extended with algorithms to render the learning path of the users dynamic and to consider their preferences, physiological values and behaviour. 3.2 EXPECTED IMPACT Although a technological intervention cannot replace live social interaction, the system is expected to have a positive personal impact on the affected person’s interaction with friends and care givers. Figure 2. The concept of the serious game system The system should increase a quality of life and ease the People with Dementia will have an effective way to evaluate burden of family and care givers. The care givers will be which exercises and e-learning modules allow them to able to interact with the users remotely trough the cognitive- perform their activities better; family carers and care 62 training game in a playful manner, re-acting the scenarios in support sophisticated machine learning algorithms, devices home environments. such as smartphones, can build predictive models of the context. Based on these models, actionable decisions that 4 INVOLVEMENT OF THE END USERS impact the future state of the system and/or its environment can be made. [9] The end-users involvement and systems adaptation to their needs is crucial for the success of the proposed systems. The 6 ACKNOWLEDGMENT target groups involve people with mild to moderate dementia, formal carers such as medical, pharmaceutical and This work would not be possible without the insights and outpatient services and care givers from dementia thoughtful discussions with Dr. Mitja Luštrek, Head of the institutions, and private subjects, and informal care givers. Ambient Intelligence Research Group and Božidara Cvetković, Department of Intelligence Systems at JSI. Both We will investigate the requirements of the target groups, have contributed with excellent knowledge on recent primary, secondary and tertiary end users. User requirements advances in the domain of ambient intelligence and will be collected separately for each of the implied groups, possibilities of mobile computing. I would like to thank as the motivations, needs and attitude towards the proposed Martin Frešer, the young researcher in the group, for technology will vary. helping to finalize the prototypes’ design. The testing follows three principal objectives including: Special thanks go to all the external experts for all their exploratory studies regarding user-requirements (both invaluable advice and their contribution. I especially wish primary users and formal and informal care givers); input to express my gratitude for contribution of Slovene experts: and feedback in the development process, including the prof. Zvezdan Pirtošek, Head of Neurology Division at review of testing of prototypes in pilot studies; and finally University Medical Centre Ljubljana, and Štefanija Lukič the retrieval of input for the further development and Zlobec, the President of the Slovenian Association for Help potential commercialization. with Dementia – Spominčica, Alzheimer Slovenia. 7 REFERENCES [1] Alzheimer's disease International. World Alzheimer Report 2015: The Global Impact of Dementia. (2015) [2] Sonnen JA, Larson EB, Crane PK, et al. Pathological correlates of dementia in a longitudinal, population- based sample of aging. Ann Neurol. 2007 Oct. 62(4):406-13 [3] Keating N, Gaudet N. Quality of life of persons with dementia. The Journal of Nutrition, Health & Aging 16(5). (2012) Figure 4. The involvement of the end-users [4] Moyle W, Fetherstonhaugh D, Greben M, Beattie E. Influencers on quality of life as reported by people living with dementia in long-term care: A descriptive Testing is an inclusive, user-centred approach. The exploratory approach. BMC Geriatrics 15. (2015) involvement of end-users shall be iterative through multiple [5] Andersen CK, Wittrup-Jensen KU, Lolk A, Andersen cycles of design, testing and adaptation of a prototype. K, Kragh-Sørensen P. Ability to perform activities of daily living is the main factor affecting quality of life in 5 CONCLUSION patients with dementia. Health and Quality of Life Outcomes 2(52). (2004) In this exploratory paper we provide an insight into the parts [6] Beard RL, Knauss J, Moyer D. Managing disability and of past year’s research and work by the ambient intelligence enjoying life: How we reframe dementia through group. These ideas and results are possible when combining personal narratives. Journal of Aging Studies 23(4). technology and methods with initiatives and in-depth (2009) knowledge by the experts form health and social domains. [7] Web page https://www.wareable.com/wearable- tech/best-smart-contact-lens (September, 2016) Mobile pervasive systems and devices have seen tremendous [8] 5 Bennett DA, Schneider JA, Arvanitakis Z, et al. advances in recent decades. Ambient intelligence is Neuropathology of older persons without cognitive becoming more and more pervasive with every new impairment from two community-based studies. generation of sensors with the added modalities and Neurology. 2006 Jun 27. 66(12):1837-44 advances in miniaturisation processes. Equipped with an [9] Pejovic V. Roadmap to Anticipatory Mobile Computing, array of sensors and powerful processing hardware that can Elektrotehniški vestnik 81(5): 247–250, 2014 63 SeaAbs: developing a low cost seafloor observatory Aleksandar Tosić Matjaž Kljun University of Primorska, FAMNIT University of Primorska, FAMNIT Glagoljaška 8 Glagoljaška 8 Koper, Slovenia Koper, Slovenia aleksandar.tosic@upr.si matjaz.kljun@upr.si ABSTRACT Two of the key challenges of seafloor observatories are con- Emphasis on ocean science in the past two decades has led stant communication capabilities and power requirements. to an increased demand for maritime observation and data The most widely used approach today is to have a network collection. Such data is commonly collected by specially of cables put in place [7, 3] in order to connect observato- designed maritime observatories which are used by various ries to internet and power them. This makes such solution stakeholders (e.g. civil society, NGO, government agencies, relatively expensive for general use [4]. Another approach policy makers, researchers) for various purposes (e.g. en- to this problem is a buoy based design [2], which is usually vironmental protection, development of sea infrastructure, making use of satellite transmission. However, such solu- bolstering tourism, climate change prediction). Most seafloor tions can limit the possible placement location greatly and explorations have so far been completed either using re- there is still a power supply in question. The later is also motely operated underwater vehicles (ROV’s) or static ob- true for all acoustic based communication systems. servatories placed on the sea floor. The latter are arguably better suited for observing the dynamics of the oceans for longer periods of time. A lot of effort has also been invested into developing of on-line communication capabilities rang- ing from optic cables to underwater acoustic wireless com- munication systems. However, such systems demand large monetary investment, which is often an obstacle especially for above mentioned civil society, NGOs and researchers. In pursuit to make seafloor observation affordable for a larger society and to progress the ocean science we have built a seafloor observatory as an open-source platform aiming to increase and speed-up sea-floor exploration, data gathering and data sharing. Categories and Subject Descriptors C.2.3 [Computer systems organization]: Embedded and cyber-physical systems—Sensors and actuators General Terms Figure 1. SeaAbs: a low cost seafloor observatory Seafloor Observatory Science, Seafloor observatory before materials are mounted on To our knowledge, with the exception of bouy Vida operated Keywords by Marine Biology Station Piran (described in e.g. [13]), Seafloor Observatory, Maritime Data, Sea Exploration, Seafloor there are no other initiatives directed towards collecting Observatory Science ocean data with the scientific objective of long-term moni- toring of environmental processes in Slovenian waters. This 1. INTRODUCTION is the case despite an increasing worldwide demand for such time-series data sets. Such measurements do not necessar- Oceans are still one of the least explored ecosystems despite ily need constant power supply and networking capabilities, the fact that they greatly influence our climate, and con- which we took into consideration while building our system. sequently environment and life on our planet. There has In this paper we describe our off-line approach suitable for been a growing trend in the past decades of moving from obtaining time-series data sets where analysis can be done exploratory research with underwater vehicles to in-situ ob- subsequently. To achieve this, we designed a prototype that servations [5]. Oceans are dynamic and best observed from was built using available of-the-shelf components. a static point of view. Dynamic observation can affect the measurements, which introduces uncertainty as if the mea- surements are results of the dynamics of the observatory itself [9, 7, 8, 11]. 64 2. MOTIVATION Our motivation comes from the need to understand how dif- ferent materials introduced to marine environments affect its wildlife. One such example are man-made structures intro- duced to the underwater forming artificial reefs. These are placed usually on the featureless bottom seafloor to promote wildlife or prevent erosion. Artificial reefs are most com- monly created by scuttling ships, oil rigs, or are purposely built using construction debris, tires, concrete or plastic (e.g. PVC). These materials are likely to cause pollution by re- leasing substances and (micro)particles not naturally found in sea environment and often become part of the sea-life food-chain [6]. As inferred above, material interference with live (marine) ecosystems must be done with great care. For this purpose we built an observatory that is designed to monitor how dif- Figure 2. SeaAbs seafloor observatory: on the right ferent materials are absorbed into marine environment. The is the tube holding all necessary hardware includ- purpose of our observatory is to identify appropriate mate- ing sensors and camera, while on the left are placed rials for artificial reef constriction and provide guidelines as different materials to which materials are better absorbed by marine environ- ment than others. In addition we will make all collected data freely available for others to carry out further research. In this paper we describe hardware, system architecture and software used in pursuit of these goals. 3. HARDWARE AND SYSTEM ARCHITEC- TURE The system design can be seen in Figure 2. The base of the observatory measures 120 cm by 80 cm and serves as a weight that keeps the observatory in place on the sea floor. The frame of the observatory is made of steel (marked with blue color on Figure 2) where the left part holds different materials for which environment absorption is being mea- sured. On the right part of the frame we have mounted a watertight custom built containment unit made of transpar- Figure 3. System Architecture ent PMMA thermoplastic in a form of a tube that holds all the electronics of the observatory. The tube can withstand pressure of 3.5 bar (equivalent to pressure at the depth of To save the energy the system is turned off most of the 25 meters). This pressure capability is sufficient for planned time and boots up automatically on pre-configured inter- observations. However, tubes withstanding higher pressure vals. Battery life can be increased by prolonging the wake- can be mounted as well. The tube is locked on the frame up intervals, hence it is very important to calculate the time with two clamps and can be easily removed for data reading interval depending on the duration of the experiment. How- and battery changing. ever, as explained above, the power bank and SD card can be replaced at regular intervals by removing the tube from The electronics’ system architecture was designed with adapt- the frame and taking it out of the water, whilst keeping the ability, affordability and extensibility in mind, and it con- observatory underwater. This enables us to conduct unin- sists of three main modules, namely: the power supply mod- terrupted observations for longer. ule, the power regulator module, and the data logging mod- ule (see Figure 3). The modules do not require intermodule The Power Control module consists of an Arduino and an communication. RTC (Real-Time Clock) shield with a dedicated 3V battery for maintaining the RTC – hence it does not drain the power One of the main challenges of supporting off-line long-term from the power bank while in sleep mode. Once the RTC observations was to keep the battery powered for as long as triggers a wake-up signal, the Arduino turns on the Data possible. Our power supply consists of an array of Lithium Logging module by connecting it to the main Power Supply. cells forming a power bank. The power bank outputs a reg- ulated 5V at 2A. The total capacity of the power bank is Once the Data Logging module is active, the Raspberry Pi estimated at 50000mAh. The high capacity is needed to boots up and immediately reads the values from the sensors achieve the desired operating time considering the expected and writes all the data to an SD card. After a successful temperature at the seabed (which can drop to 5 degrees Cel- write the module shuts down by it self, ready to be turned sius). The temperature is expected to have a great impact on again at the next interval. Reading sensors can also be on the battery capacity [1] as can be seen in Figure 4. achieved with the Arduino board. However, in our case we 65 • Writes a new line to the measurements file (on SD card) with a timestamp and stores the image with the same timestamp. • Initiates a graceful shutdown Besides this simple Python script run in rc.local on the Rasp- berry Pi, we have written a simple script for Arduino PCB that regulates power supply to the Raspberry Pi. The pur- pose of the script is to ensure that the system does not drain unnecessary power after the shut down by cutting off the power supply to Raspberry Pi after two minutes. The pur- pose of the script is also to forcefully shut down the Rasp- berry if the shutdown process of the Raspberry Pi hangs during the shutdown process. Figure 4. Battery capacity & battery life compared All the code for both Raspberry Pi and Arduino is freely at different temperatures [1] available in our Git repository http://git.redmine.pint. upr.si/absorbcija-tujkov-v-morskem-okolju. needed a good camera module to take photos of the ma- 5. CONCLUSIONS terials observed in front of the tube, which Arduino lacks. By using mostly off-the-shelf open-source software and hard- Using Raspberry PI also expanded the GPIO pins available ware we designed and build a seafloor observatory system and provided several ways to build upon the existing plat- that is robust and reliable enough for obtaining time-series form. An abundance of modules and shields with plug & data sets for subsequent analysis after the system is retrieved play capabilities are available for both platforms. Adding from the seafloor (usually just the tube). Compared to new sensors and monitoring capabilities is therefore very other seafloor observatories, our SeaAbs costs just a frac- simple. tion of their price (less than 1800e or $2000). This makes it affordable for wider public and the research community 4. SOFTWARE enabling large scale marine environment data collation. Ef- For the described scenario to work, we use a stock operating fective data collection tools are indispensable for maritime system Raspbian that comes pre-installed on Raspberry Pi. research filed, such as: marine biology, sea ecology, physical It is a Linux distribution, which allowed us to take advan- oceanography, and conservation biology[3]. tage of its runlevels. These are the states of the system, indicating if the system in is the process of booting, reboot- We have currently built two observatories and are testing ing, single-user mode, shutting down, or in normal multiuser them before deploying them into North Adriatic sea. Both modes. When runlevel changes, init runs so called rc scripts observatories hold different materials for observing their ab- that take care of appropriate tasks for that runlevel. After sorption into marine ecosystem. We are focusing on mate- all the normal services start at the end of the process of rials that are abundantly available, comparatively inexpen- switching to multiuser mode, the init executes /etc/rc.local sive, and not too invasive in terms of releasing substances to start custom services. and (micro)particles into the sea. In the near future we are planing to deploy both observatories in two different loca- Our custom service is written in Python and follows the fol- tions observing 5 different materials on each. Current sys- lowing procedure, based on the sensors connected to GPIO tem lacks real-time data processing capabilities which we and the camera with the flash we use to take time-series of plan to add in future versions. Another limitation of our photos: observatory is also the fact that all components are placed underwater, hence can only be reached by scuba divers (re- quired when power bank and memory card need replace- • Reads the temeprature from the Waterproof DS18B20 ment). In future versions, we plan to look into ways of Digital temperature sensor automating the retrial of the tank holding electronics. • Reads the the luminosity (Lux) from the TSL2561 Dig- Irrespective of aforementioned limitations, the project has ital Luminosity/Lux/Light Sensor a long-term scope. Human interventions and operations are causing global and regional environmental and climate • Turns on the LED flesh made out of an array of ultra changes that may dramatically modify the structures and bright LED’s. functions of marine (eco)system. The altered resource po- tentials and ecosystem services in turn affect the livelihoods • Snaps a photo with of the illuminated observed ma- of the population depended on these ecosystems. Slovenia terials (we have placed the lens of the camera on the is a maritime country economically dependent on the state front tube cover facing the observed materials in order of the oceans. Therefore it needs to participate in active to avoid glare) monitoring of the oceans through maritime data collection which may contribute to regional as well as global societal, • Turns off the LED flash economic, political, and cultural changes. Our prototype, 66 as an inexpensive open source alternative to more expensive abyss. Springer Berlin Heidelberg, Berlin, Heidelberg, seafloor observatories, is a step in this direction. 2015. [5] S. Favali, P., Person, R., Barnes, C.R., Kaneda, Y., Furthermore, data that will be collected by our prototype Delaney, J.R., Hsu. Seafloor Observatory Science. In (and similar systems) offers an excellent starting-point for Proceeding of the 10th IEEE International Conference different simulations and models of various marine ecosys- on Advanced Learning Technologies ICALT010, 2010. tem aspects - one such example could be to use a combi- [6] L. Jacobson. Artificial Reefs, Sunken Ships, and nation of computer vision techniques [12] (to transform the Military Toxins. http://www.toxipedia.org/ captured pictures and/or video feed into numeric features) display/toxipedia/Artificial+Reefs,+Sunken+ and machine learning algorithms [10] (to build a mathemat- Ships,+and+Military+Toxins, 2011. ical model from the data) to model the affluence of vari- [7] T. Lado Insua, K. Moran, I. Kulin, S. Farrington, ous fish species to artificial reefs and thus improve our un- J. Newman, M. Riedel, G. Iturrino, W. Masterson, derstanding of artificial reef dynamics in large marine and C. Furman, A. Klaus, M. Storms, J. Attryde, freshwater systems. C. Hetmaniak, and D. Huey. The Successful Delpoyment of a New Sub-seafloor Observatory. In Acknowledgment American Geophysical Union Fall Meeting 2013, 2013. The initial stage of the project was funded by AdFutura [8] T. Lado-Insua, K. Moran, I. Kulin, S. Farrington, and through “Path to Practical Knowledge” programme, which J. B. Newman. SCIMPI: a new borehole observatory. was carried out with the support of the Ministry of Educa- Scientific Drilling, 16(16):57–61, nov 2013. tion, Science and Sport, and the European Social Fund. [9] T. Lado Insua, K. Moran, I. Kulin, S. Farrington, M. Riedel, M. Scherwath, M. Heesemann, B. Pirenne, 6. REFERENCES G. Iturrino, W. Masterson, and C. Furman. One year [1] Temperature effects on battery performance & life. of data of SCIMPI borehole measurements. In www.heliant.it/images/FV/ev_temperature_ American Geophysical Union Fall Meeting, 2014. effects.pdf, 2015. [10] T. M. Mitchell. Machine Learning. McGraw-Hill [2] M. Chaffey, L. Bird, J. Erickson, J. Graybeal, Education, New York, NY, USA, 1st edition, 1997. A. Hamilton, K. Headley, M. Kelley, L. McBride, [11] K. Moran, S. Farrington, E. Massion, C. Paull, E. Mellinger, T. Meese, T. O’Reilly, W. Paul, M. Risi, R. Stephen, A. Tréhu, and W. Ussler. SCIMPI: A new and W. Radochonski. MBARI’s buoy based seafloor seafloor observatory system. In OCEANS 2006, 2006. observatory design. In Oceans ’04 MTS/IEEE [12] M. Nixon and A. Aguado. Feature Extraction & Image Techno-Ocean ’04 (IEEE Cat. No.04CH37600), Processing for Computer Vision, Third Edition. volume 4, pages 1975–1984. IEEE, 2004. Academic Press, Oxford, UK, 3rd edition, 2012. [3] P. Favali and L. Beranzoli. Seafloor observatory [13] D. Turk, J. Book, and W. McGillis. pCO2 and CO2 science: A review. Annals of Geophysics, exchange during high bora winds in the Northern 49(2-3):515–567, 2006. Adriatic. Journal of Marine Systems, 117-118:65–71, [4] P. Favali, L. Beranzoli, and A. De Santis. Seafloor may 2013. observatories: A new vision of the Earth from the 67 Primerjava razdalj pri klasifikaciji in regresiji Tomaž Stepišnik Perdih Panče Panov Andrej Bauer Fakulteta za matematiko in Institut Jožef Stefan Fakulteta za matematiko in fiziko, Jadranska ulica 19, in Jamova cesta 39 fiziko, Jadranska ulica 19, Institut Jožef Stefan, Jamova Ljubljana, Slovenija Ljubljana, Slovenija cesta 39, Ljubljana, Slovenija pance.panov@ijs.si andrej.bauer@fmf.uni- tomaz.stepisnik@ijs.si lj.si Sašo Džeroski Institut Jožef Stefan Jamova cesta 39 Ljubljana, Slovenija saso.dzeroski@ijs.si POVZETEK in 1 kadar nista. V članku je narejena primerjava uspešnosti algoritma k naj- (0, če x = y bližjih sosedov ob uporabi različnih razdalj med vhodnimi d(x, y) = (1) podatki. Primerjana je uspešnost pri klasifikaciji in regresiji 1, sicer s standardno obliko vhodnih podatkov (n-terice z numerič- nimi ali nominalnimi komponentami). Pri numeričnih podatkih uporabimo absolutno vrednost raz- like, vendar moramo poskrbeti še za omejenost na interval Natančnost za posamezne razdalje je bila dobljena z 10- [0, 1]. To naredimo tako, da razliko pod določeno mejo za- delnim prečnim preverjanjem. S Friedmanovim testom je nemarimo in preslikamo v razdaljo 0, od neke meje naprej bila testirana hipoteza, da je algoritem z vsemi razdaljami pa obravnavamo kot maksimalno in preslikamo v razdaljo 1. enako natančen. V primeru zavrnitve hipoteze so bili pari Med tema dvema točkama pa linearno interpoliramo. razdalj primerjani še z Nemenyijevim testom. |x − y| − a d(x, y) = min(1, max(0, )) (2) b − a 1. UVOD Vrednost parametra a je bila pri testiranju vedno 0, vrednost Valentin Gjorgjioski v svojem doktoratu [4] naredi obširno b pa odvisna od učne množice, kot bo opisano kasneje. Za raziskavo o vplivu različnih razdalj na natančnost napove- lažjo predstavo je odvisnost razdalje od absolutne vrednosti dnega modeliranja. Posveti se predvsem razdaljam na struk- razlike prikazana na sliki 1. turiranih podatkih (časovnih vrstah ter množicah), nekoliko manj pa razdaljam na najpogostejši strukturi: n-terici. V tem članku bodo primerjani različni načini agregacije razdalj na komponentah n-terice v skupno razdaljo. Najprej bodo v drugem razdelku opisane razdalje, ki so bile primerjane. V tretjem razdelku je opisan algoritem k naj- bližjih sosedov in postopek različnih mer uspešnosti algo- ritma. Opisan je tudi način primerjave razdalj ter podat- kovne množice, na katerih so bile primerjane. V četrtem razdelku so predstavljeni rezultati testiranja. 2. OPIS RAZDALJ Slika 1. Numerična razdalja v odvisnosti od absolutne vre- dnosti razlike. Vhodni podatki so bili podatkovne množice n-teric z nu- meričnimi ali nominalnimi komponentami. Razdalja med Za razdaljo med n-tericama najprej izračunamo n razdalj dvema n-tericama je bila dobljena tako, da so bile najprej iz- med istoležečimi komponentami z zgoraj definiranima raz- računane razdalje med istoležečimi komponentami, nato pa daljama. Za skupno razdaljo potem izračunamo uteženo te razdalje po postopku urejenostno utežene agregacije (ang. potenčno sredino razdalj med komponentami. Denimo, da ordered weighted aggregation)[6] združene v skupno razdaljo. imamo n-terici x Opisati moramo torej računanje razdalje med numeričnimi 1, . . . , xn in y1, . . . , yn, pozitiven parame- ter p ter nenegativne uteži o in nominalnimi komponentami ter postopek agregacije. 1, . . . , on. Razdaljo med njima dobimo po naslednjih korakih. Za razdalje bomo zahtevali, da so simetrične (tj. d(x, y) = d(y, x)) in da slikajo na interval [0, 1]. Za nominalne kompo- 1. Če je vsota uteži O = P oi enaka 0, je tudi skupna nente uporabimo kar razdaljo, ki je 0, če sta podatka enaka, razdalja enaka 0. Sicer računamo naprej. 68 2. Izračunamo razdalje di med komponentami. RMSE (ang. root mean squared error ), RAE (ang. relative absolute error ) ter RRSE (ang. root relative squared error ). 3. Razdalje med komponentami uredimo po velikosti. Do- Napovedi naredimo za vsako mero natančnosti posebej, saj bimo u1, . . . , un od najmanjše do največje. se lahko pri različnih merah izbrani parametri k razlikujejo. 4. Vrnemo vrednost Napovedi naredimo za več različnih podatkovnih množic, 1 n ! 1 p primerjava razdalj pa poteka kot je predlagano v [3]. Za X oiup . (3) vsako mero uspešnosti s popravljenim Friedmanovim testom O i i=1 testiramo hipotezo, da je algoritem k najbližjih sosedov z vsemi razdaljami enako uspešen. Če je hipoteza zavrnjena, Utežem o z Nemenyjevim testom naredimo še primerjave posameznih 1, . . . , on pravimo urejenostne, saj utežujejo po na- čelu OWA [6]. parov razdalj. Vsi testi so bili narejeni s stopnjo zaupanja 0,05. Testirali smo 10 različnih razdalj, ki so se razlikovale veči- noma v urejenostnih utežeh. Skice uporabljenih uteževanj so predstavljene na sliki 3. Vse razdalje razen evklidske so imele parameter p enak 1, pri evklidski pa je bil 2. 3. METODOLOGIJA Pri poskusih smo vhodnim podatkom napovedovali njihove izhode s pomočjo učne množice. Delali smo tako klasifika- cijo (nominali izhodi) kot regresijo (numerični izhodi). Za napovedovanje je bil uporabljen algoritem k najbližjih so- sedov (kNN), znotraj katerega so bile uporabljene razdalje, Slika 2. Primer rezultatov Nemenyijevih testov. opisane v prejšnjem razdelku. Algoritem k najbližjih sosedov, opisan v algoritmu 1, da- Rezultate predstavimo z grafom, primer katerega je na sliki 2. nemu vhodnemu podatku poišče k najbližjih vhodov v učni Na skali je označen povprečen rang razdalje, dosežen na raz- množici in napove centroid njihovih izhodov. Pri klasifika- ličnih podatkovnih množicah. Črta pod skalo označuje, za ciji za izračun centroida naredimo uteženo glasovanje: vsak koliko se morata povprečna ranga dveh razdalj razlikovati, izmed k najbližjih vhodov glasuje za svoj izhod, utež pa da je razlika med njima statistično pomembna. Črte na je obratna vrednost njegove oddaljenosti od vhodnega po- grafu pa povezujejo razdalje, med katerimi ni statistično po- datka. Centroid je izhod z največ glasovi. Pri regresiji pa membnih razlik. V tem primeru lahko torej sklepamo, da je izračunamo uteženo povprečje izhodov, uteži pa so ponovno razdalja A boljša od razdalj D in C, ter da je B boljša od C. obratne vrednosti oddaljenosti. Če je oddaljenost katerega Med ostalimi pari pa ni pomembnih razlik. soseda 0, je njegova utež neskončna. V tem primeru vse neskončne uteži zamenjamo z 1, preostale pa z 0, da o cen- Za regresijo smo uporabili 10 podatkovnih množic, katerih troidu odločajo enakomerno sosedi z oddaljenostjo 0. lastnosti so prikazane v tabeli 1. Klasifikacija je bila nare- jena v dveh delih. V prvem delu je bilo uporabljenih 10 po- Algoritem 1 k najbližjih sosedov datkovnih množic z nizkimi dimenzijami vhodov (podobno Vhod: Učna množica T , razdalja na vhodih d, vhod x. kot pri regresiji), predstavljenih v tabeli 2. V drugem delu Izhod: Napoved za izhod od x. pa smo uporabili podatkovne množice s precej višjimi di- 1: Izberi k glede na T in d. menzijami vhodov. Bilo jih je prav tako 10, opisane pa so v 2: Izračunaj razdaljo d med x in vhodi v T . tabeli 3. Podatkovne množice z nizkodimenzionalnimi vhodi 3: Poišči izhode k-tih najbližjih vhodov. so bile pridobljene s spletne strani [2], množice z visokodi- 4: vrni: Centroid teh izhodov. menzionalnimi vhodi pa s strani [1]. Izbira množic je bila pretežno naključna, z nekoliko ozira na čim večjo raznolikost Primerjali smo uspšenost napovedovanja z različnimi razda- ljami v algoritmu kNN. Uspešnost smo merili z 10-delnim Tabela 1. Podatkovne množice za regresijo prečnim preverjanjem. Za vsako od desetih učnih množic je bil posebej določen parameter b pri numerični razdalji podat. množica primerov atributov (enačba 2) in sicer kot razpon vseh števil na tej komponenti pri podatkih v učni množici. Potem je bil glede na učno mno- quake 2178 3 žico izbran tudi parameter k. Z metodo ”izloči enega”(ang. pbc 418 18 leave-one-out ) dobimo napovedi za podatke v učni množici housing 506 13 za k med 1 in 50, ter izberemo tisti k, pri katerem je napaka meta 528 21 najmanjša (oz. natančnost največja). strike 625 6 abalone 4177 8 Uporabimo mere uspešnosti iz programskega orodja WEKA[5]. autoMpg 398 7 Za klasifikacijo uporabimo utežena povprečja mer natanč- hungarian 294 13 nost (ang. precision), priklic (ang. recall ) in f1 po razre- cpu 209 7 dih, za regresijo pa mere MAE (ang. mean absolute error ), pharynx 195 11 69 maximum stopnica sigmoida linearna povprečje,evklidska obrezana gauss trikotnik mediana Slika 3. Skice urejenostnih uteži. v številu primerov, atributov, pri klasifikaciji tudi razredov. Tabela 2. Podatkovne množice za klasifikacijo. podat. množica primerov atributov razredov iris 150 4 3 diabetes 768 8 2 credit-a 690 15 2 car 1728 6 4 primary-tumor 339 17 22 autos 205 25 7 mfeat-morpho 2000 6 10 vote 435 16 2 (a) MAE tic-tac-toe 958 9 2 credit-g 1000 20 2 Tabela 3. Podatkovne množice visokih dimenzij za klasifi- kacijo. podat. množica primerov atributov razredov amlPrognosis 54 12625 2 (b) RMSE bladderCancer 40 5724 3 breastCancer 24 12625 2 childhoodAll 110 8280 2 cmlTreatment 28 12625 2 dlbcl 77 7070 2 levkemija 72 5147 2 mll 72 12533 3 prostata 102 12533 2 srbct 83 2308 4 (c) RAE 4. REZULTATI 4.1 Regresija Pri regresijskih podatkih je Friedmanov test hipotezo ovrgel za vse štiri mere natančnosti. To pomeni, da med različnimi razdaljami obstajajo razlike v uspešnosti. Te so opisane z rezultati Nemenyjevih testov, predstavljenih na sliki 4. (d) RRSE Pri meri MAE je sigmoida boljša od mediane in maksimuma, Slika 4. Diagrami povprečnih rangov z rezultati Nemenyi- poleg tega pa so od mediane boljše tudi stopnica, linearna jevih testov za kNN z različnimi razdaljami na regresijskih in evklidska. Pri RMSE je sigmoida boljša od maksimuma, množicah. pri RRSE pa linearna od mediane. Pri RAE so od mediane boljše linearna, sigmoida in povprečje. Če gledamo samo povprečne range lahko vidimo, da nobena razdalja nima pre- cej večjega uspeha od ostalih, da pa sta ranga mediane in maximuma večinoma precej slabša od rangov ostalih razdalj. 70 (a) natančnost (b) priklic (c) f1 Slika 5. Diagrami povprečnih rangov z rezultati Nemenyijevih testov za kNN z različnimi razdaljami na klasifikacijskih množicah nizke dimenzije. (a) natančnost (b) priklic (c) f1 Slika 6. Diagrami povprečnih rangov z rezultati Nemenyijevih testov za kNN z različnimi razdaljami na klasifikacijskih množicah visoke dimenzije. 4.2 Klasifikacija like razdalje med komponentami (z izjemo maksimuma, ki je Tudi tukaj je Friedmanov test hipotezo zavrgel za vse mere. ekstremen primer teh razdalj). V večini primerov ni bilo ene Razlike v uspešnosti opisujejo rezultati Nemenyijevih testov razdalje, ki bi bila bistveno uspešnejša od ostalih, pogosto na sliki 5. pa sta mediana in maksimum odstopala v negativno smer. Testi pri meri natančnost niso pokazali nobenih statistično Delo je predvsem podlaga za nadaljnje raziskovanje. Primer- pomembnih razlik. Pri meri priklic je stopnica boljša od java razdalj se lahko naredi tudi v primerih, ko so vhodni maksimuma in mediane, od mediane pa so boljše tudi gauss, ali izhodni podatki strukturirani (zaporedja, množice, . . . ). sigmoida, povprečje in evklidska. Pri meri f1 sta gauss in Program, narejen za testiranje, omogoča algoritmu k naj- stopnica boljša od maksimuma in mediane. Na zadnjih treh bližjih sosedov, da glede na trening množico podobno kot mestih po povprečnih rangih so vedno obrezana, maksimum parameter k izbere tudi razdaljo, ki jo bo uporabljal za na- in mediana. povedovanje, izmed množice možnih. Več možnih razdalj načeloma poveča možnosti za dobro napovedovanje, vendar 4.3 Klasifikacija podatkov visoke dimenzije poveča časovno zahtevnost izbiranja razdalje. Zanimivo bi bilo na primer testirati, kateri nabori razdalj dobro delujejo Friedmanov test znova hipotezo zavrne pri vseh merah. Raz- skupaj. like v uspešnosti opišejo rezultati Nemenyijevih testov na sliki 6. 6. VIRI Rezultati so precej drugačni kot pri podatkih nizke dimen- [1] Bioinformatics laboratory. zije. Pri vseh treh merah je sedaj le maksimum statistično http://www.biolab.si/supp/bi-cancer/projections/. slabši od prvih nekaj razdalj. Če pogledamo povprečne Obiskano: 2016-06-24. range, vidimo da je res daleč za ostalimi razdaljami, ki so [2] UCI Machine Learning Repository. tukaj bolj na kupu kot v prejšnjih primerih. Zanimivo je, http://archive.ics.uci.edu/ml/. Obiskano: 2016-06-24. da se mediana odreže toliko bolje (pri meri f1 celo najbolje), [3] J. Demšar. Statistical comparisons of classifiers over saj ravno tako kot maksimum za skupno razdaljo vzame le multiple data sets. Journal of Machine Learning eno od več tisočih komponent. Research, 7:1–30, 2006. [4] V. Gjorgjioski. Distance-Based Learning from 5. ZAKLJU ČEK Structured Data. PhD thesis, Mednarodna podiplomska Primerjana je bila uspešnost algoritma k najbližjih sosedov šola Jožefa Stefana, Ljubljana, 2014. ob uporabi različnih razdalj med podatki, v primerih ko so [5] M. Hall, E. Frank, G. Holmes, B. Pfahringer, vhodni podatki n-terice z numeričnimi ali nominalnimi kom- P. Reutemann, and I. H. Witten. The WEKA Data ponentami, ciljna spremenljivka pa prav tako numerična (re- Mining Software: An Update. SIGKDD Explor. Newsl., gresija) ali nominalna (klasifikacija). Uporabljenih je bilo 11(1):10–18, Nov. 2009. več mer uspešnosti, merjenih z 10-delnim prečnim prever- [6] R. R. Yager and J. Kacprzyk, editors. The Ordered janjem, primerjava pa je bila narejena s Friedmanovimi in Weighted Averaging Operators. Springer, 1997. Nemenyijevimi testi. Testi so pokazali večjo uspešnost razdalj, ki bolj utežijo ve- 71 Avtomatska prepoznava teniških udarcev iz senzorskih podatkov Miha Mlakar Institut »Jožef Stefan« Jamova cesta 39 1000 Ljubljana miha.mlakar@ijs.si POVZETEK 2. PRIDOBITEV PODATKOV V tem članku smo prikazali postopek izdelave sistema za Za testiranje smo posneli šest različnih tekem s petimi različnimi avtomatsko prepoznavo teniških udarcev iz senzorskih podatkov. igralci. Ker nas je zanimala avtomatska prepoznava udarcev v Prikazali smo, kako je potekal zajem, kakšno je potrebno pred- realnem okolju, smo vse posnetke posneli med tekmami, ne med procesiranje podatkov, izračun potrebnih atributov in nato gradnja treningi v predvidljivem okolju. modelov. Modele smo ovrednotili na dva različna načina in prikazali, da dobljeni model avtomatske prepoznave udarcev Igralci so imeli na sebi napravo S5 podjetja Catapult Sports, ki so deluje zelo dobro. ga nosili na hrbtu med lopaticami v zato namenjeni oprijeti majici (slika 1) Pridobljeni sistem je mišljen kot začetni del celovitega sistema za analizo teniške igre iz senzorskih podatkov. Prepoznavanje tipov udarcev je osnova iz katere bodo nato izhajale prihodnje analize. Splošni izrazi Algoritmi, meritve, testiranje, poskusi. Ključne besede Senzorske meritve, tenis, pospeškomer, teniški udarci. 1. UVOD Uporaba oblačilnih senzorjev, ki jih športniki nosijo med športno vadbo je trenutno v velikem porastu. Različni senzorji in pristopi se uporabljajo v večini športov. Naprave, ki uporabljajo različne senzorje za izvajanje analiz in meritev v tenisu lahko razdelimo na tri skupine. Prva skupina so senzorji, ki se jih nosi na roki, oziroma so vgrajeni v teniški lopar. Ti senzorji nam prinašajo Slika1: Lokacija senzorja, ki smo ga uporabljali za meritve. Vir: informacije o udarcih, vendar je ta informacija brez konteksta in sporttechie.com je zato težko uporabna za izboljševanje teniške igre. Druga skupina so senzorji, ki se nosijo na hrbtu in poleg pospeškomerov Naprava S5 je vseboval naslednje senzorje: in žiroskopov vsebujejo GPS sprejemnik za zaznavanje premikov - Tridimenzionalni pospeškomter (osveževanje s igralca. Vendar pa se GPS izkaže za relativno nenatančnega, tako frekvenco 100HZ) da so napake velike od 3-5 metrov, kar je glede na velikost teniškega igrišča preveč, da bi bile lahko meritve uporabne za - Tridimenzionalni žiroskop (osveževanje s frekvenco določanje položaja na igrišču. Pole tega ti sistemi ne nudijo 100HZ) nobenih teniško prilagojenih meritev in analiz. Tretja skupina pa so sistemi, ki temeljijo na video analizi. Pri teh sistemih je - Tridimenzionalni magnetometer (osveževanje s potrebno postaviti 4-8 kamer okoli igrišča in potrebno je kupiti frekvenco 100HZ) pripadajočo programsko opremo. Zaradi tega so ti sistemi izredno - GPS senzor, ki je vračal pozicijo v obliki zemljepisna dragi in so vezani na igrišče in ne na igralca, tako da, če igralec širina in dolžina (osveževanje s frekvenco 10HZ) igra na nekem drugem igrišču, mu analize niso na razpolago. Kljub tem različnim možnostim pa na trgu trenutno ni senzorja, ki bi prepoznaval tako gibanje kot tenisko specifične metrike. Zato Na ta način smo posneli za več kot 6 ur posnetkov. Ker so bili smo vzeli obstoječi senzor S5 podjetja Catapult Sports in ga podatki pridobljeni 100 krat na sekundo smo tako pridobili uporabili za naše namene. Iz senzorja smo pridobili surove 2.172.363 podatkovnih vnosov. V tem času smo zabeležili 1.373 podatke, ki smo jih nato analizirali in izdelali avtomatski sistem udarcev. Vsak udarec smo označili kot eno izmed naslednjih za prepoznavanje teniških udarcev. Prepoznava udarcev nam bo možnosti: služila kot izhodišče za vse nadaljnje analize teniške igre, zato je - Servis kvalitetna prepoznava udarcev zelo pomembna. - Forehand 72 - Backhand Izračunana razlika med vrednostma kotnih hitrosti za osi 1 in 3 žiroskopa pri udarcu nam je služila kot začetno izhodišče pri avtomatskem določanju tipa udarcev. 3. PRED-PROCESIRANJE Na podlagi izrisanih signalov za pospeške in hitrosti obračanja Postopek pridobivanja spremenljivke, ki nam pomaga pri smo najprej poskusili ugotoviti kaj so ključne značilke za odkrivanju ali gre pri neki kombinaciji senzorskih vrednosti za diferenciacijo udarcev. Na sliki 2 lahko vidimo tipično sled udarec sledi sledečim korakom: pospeškomera in žiroskopa pri backhand udarcu. - izračunamo novo spremenljivko Spin, ki nam predstavlja absolutno vsoto kotnih hitrosti pridobljenih z žiroskopom - spremenljivko Spin potenciramo, da damo poudarek večjim vrednostim - apliciramo Butterworth pasovni filter [1] na spremenljivko Spin - negativne vrednosti popravimo na 0 - omejimo minimalno razdaljo med dvema vrhovoma (razdaljo smo eksperimentalno določili na 1,3 sekunde) Na ta način dobimo spremenljivko, ki nam označuje potencialne udarce in jo poimenujemo Peaks. Ta spremenljivka seveda ni dovolj za prepoznavo ali se je nek udarec zgodil v nekem trenutku ali ne, nam pa pomaga pri identifikaciji. V gradnjo modela je potrebno vključiti še dodatne spremenljivke, kot so povprečne vrednosti in standardni odkloni posameznih parametrov ter vsote in razlike pospeškov in kotnih hitrosti po različnih oseh. Slika 2: Vrednosti pospeškomera in žiroskopa skozi čas pri backhand udarcu označenemu z navpično črno črto. Kot primer je na sliki 4 prikazano, kako nam spremenljivki Peaks (prikazane so le vrednosti večje od 400) in hitrost premikanja Ko smo sled žiroskopa pri backhandu primerjali s sledjo pri pridobljena iz GPS podatkov razdelita prostor, kar nam pomaga forhendu smo ugotovili, da se ključno razlikujeta po vrednostih pri identifikaciji udarcev. kotnih hitrosti izmerjenih po osi 1 in 3. Na sliki 3 je prikazana razlika. Slika 4: Razdelitev prostora potencialnih udarcev glede na spremenljivki Peaks in hitrost premikanja. 4. TESTIRANJE Testiranje smo razdelili na dva dela. Najprej smo testirali, kako točno lahko napovemo ali in kdaj se je zgodil udarec. V drugem delu pa smo poskusili napovedati tudi tip udarca – forehand, backhand ali servis. Za gradnjo modela za napovedovanje udarcev smo v obeh primerih uporabili metodologijo naključnih gozdov (RF) [2]. Slika 3: Razlika v vrednostih kotnih hitrosti za os 1 in 3 za Vzeli smo implementacijo v programskem jeziku Python žiroskop pri backhand in forehand udarcu. implementirano v paketu scikit-learn. Metoda RF je vsebovala 10 73 naključnih dreves, ostalih parametrov pa nismo spreminjali, tako 5.2 Navzkrižno preverjanje s po enim da so imeli privzete vrednosti. izpuščenim igralcem Model smo testirali na dva načina. Pri prvem načinu smo za preverjanje točnosti modela uporabili metodo desetkratnega Z metodo navzkrižnega preverjanja s po enim izpuščenim prečnega preverjanja [3], ki je za deljenje podatkov znotraj igralcem smo dobili rezultate za natančnost in priklic ločeno za prečnega preverjanja uporabljala postopek uravnoteženega vsakega igralca. naključne delitve (Stratified shuffle split) [4]. Ta postopek zagotavlja, da so pri vsakem prečnem preverjanju razmerja med Rezultati klasifikacije za forehand udarec: razredi v testni in učni množici enaka. To nam zagotovi pravilno povprečje rezultata čez celotno prečno preverjanje. - Natančnost: 91.5% [94%, 91%, 84%, 96%, 92%] Pri drugem načinu smo testirali točnost modela po metodi - Priklic: 90.5% [91%, 88%, 86%, 91%, 96%] navzkrižnega preverjanja s po enim izpuščenim igralcem, ki Rezultati klasifikacije za backhand udarec: deluje po principu, da se model nauči na podatkih, ki ne vsebuje vrednosti izpuščenega igralca. Te vrednosti pa se nato uporabijo - Natančnost: 93.6% [98%, 98%, 84%, 91%, 95%] za testiranje točnosti modela. Ta postopek se ponovi tolikokrat, - Priklic: 90.6% [96%, 78%, 85%, 96%, 95%] kolikor imamo različnih igralcev. Tak pristop nam omogoča vpogled v univerzalnost modela, saj nam njegovo dobro delovanje Rezultati klasifikacije za servis udarec: pove, da bo model uspešno napovedoval udarce tudi za igralce, ki - Natančnost: 99.8% [100%, 100%, 100%, 99%, 100%] jih nismo vključili v testiranje. - Priklic: 98.2% [98%, 98%, 100%, 99%, 95%] Skupni povprečni rezultati preko vseh udarcev (in igralcev): 5. REZULTATI - Natančnost 95.0% 5.1 Metoda prečnega preverjanja - Priklic 93.1% Z metodo prečnega preverjanja smo klasificirali točnost Kot lahko vidimo iz rezultatov sta natančnost in priklic napovedovanja forehand, backhand in servis udarcev. pridobljena z metodama prečnega preverjanja in navzkrižnega Rezultati klasifikacije za forehand udarec: preverjanja s po enim izpuščenim igralcem podobna. To pomeni, da je model relativno neodvisen od tipa igralca, oziroma od - Natančnost 95.3% tehnike izvedbe udarcev in da model, ki smo ga naučili na ostalih - Priklic 91.4% udarcih dobro napoveduje tudi za izbranega igralca. Rezultati klasifikacije za backhand udarec: Glavni vir napak so posebnosti, ki se dogodijo med samo igro, kot je recimo udarec senzorja z loparjem, posebni gibi, ki jih ne - Natančnost 94.3% moremo klasificirati med izbrane tri udarce in pa posebni primeri hitrega obračanja igralca okrog svoje osi, kar rezultira v podobnih - Priklic 90.2% vrednostih senzorjev, kot jih imamo pri udarcih. Rezultati klasifikacije za servis udarec: Ker nam model zgrajen z RF ne omogoča prikaza kompleksnost - Natančnost 99.1% smo zgradili odločitveno drevo za detekcijo tipov udarcev (slika 5). Ker je drevo preveč kompleksno za podroben prikaz, - Priklic 99.3% prikazujemo le shemo drevesa, da prikažemo strukturo in Skupni povprečni rezultati preko vseh udarcev: kompleksnost dobljenega modela. Kot lahko vidimo je drevo razvejano in uporablja veliko različnih atributov v vozliščih. - Natančnost 96.2% - Priklic 93.6% 6. ZAKLJUČEK V članku smo pokazali postopek za gradnjo modela za 74 avtomatsko detekcijo teniških udarcev iz realni tekem in ne le [2] Liaw, Andy, and Matthew Wiener. "Classification and treningov v kontroliranem okolju. Vsak igralec je na hrbtu nosil regression by randomForest." R news 2.3 (2002): 18-22. posebno napravo s senzorji iz katerih smo nato s pred- [3] Kohavi, Ron. "A study of cross-validation and bootstrap for procesiranjem in izdelavo atributov zgradili model, ki nam je accuracy estimation and model selection." Ijcai. Vol. 14. No. služil za prepoznavo udarcev. Natančnost in priklic modela sta 2. 1995. bila nad 93%, kar je zadovoljiv rezultat. Z dodatnim pred- procesiranjem podatkov, ki bi odstranil posebnosti in pogladil [4] Balmin, Andrey, et al. "Stratified sampling using adaptive vrednosti senzorjev, pa bi natančnost in priklic lahko še povečali. parallel data processing." U.S. Patent Application No. 14/141,635. Pridobljeni model je dovolj dober, da nam bo lahko v prihodnosti služil kot izhodišče za nadaljnje raziskave in analize v tenisu. 7. LITERATURA [1] Selesnick, Ivan W., and C. Sidney Burrus. "Generalized digital Butterworth filter design." IEEE Transactions on Signal Processing 46.6 (1998): 1688-1694. 75 76 Indeks avtorjev / Author index Bauer Andrej ................................................................................................................................................................................ 68 Bohanec Marko .............................................................................................................................................................................. 5 Češnovar Rok ................................................................................................................................................................................. 9 Cvetković Božidara ...................................................................................................................................................................... 13 Dimitriev Aleksandar ................................................................................................................................................................... 17 Dovgan Erik ................................................................................................................................................................................. 21 Džerovski Saso ............................................................................................................................................................................. 68 Fabjan David A. ........................................................................................................................................................................... 60 Filipič Bogdan .............................................................................................................................................................................. 52 Frešer Martin ................................................................................................................................................................................ 13 Gams Matjaž .......................................................................................................................................................................... 44, 56 Gjoreski Martin ...................................................................................................................................................................... 13, 56 Gradišek Anton ............................................................................................................................................................................ 25 J. Kok Esther .................................................................................................................................................................................. 5 Janko Vito .................................................................................................................................................................................... 28 Javornik Anže ............................................................................................................................................................................... 32 Kljun Matjaž ................................................................................................................................................................................. 64 Kocjan Andraž ............................................................................................................................................................................. 25 Kononenko Igor ........................................................................................................................................................................... 32 Kosiedowski Michał ..................................................................................................................................................................... 13 Kužnar Damjan ............................................................................................................................................................................ 44 Luštrek Mitja .................................................................................................................................................................... 13, 28, 56 Mileva Boshkoska Biljana.............................................................................................................................................................. 5 Mlakar Miha ........................................................................................................................................................................... 25, 72 Panov Panče ................................................................................................................................................................................. 68 Pevec Darko ................................................................................................................................................................................. 32 Ploj Bojan ..................................................................................................................................................................................... 36 Robnik-Šikonja Marko ................................................................................................................................................................. 40 Šef Tomaž .................................................................................................................................................................................... 48 Stepišnik Perdih Tomaž ............................................................................................................................................................... 68 Štrumbelj Erik .......................................................................................................................................................................... 9, 17 Terbuc Martin ............................................................................................................................................................................... 36 Tošić Aleksandar .......................................................................................................................................................................... 64 Tušar Tea ...................................................................................................................................................................................... 52 W. Prins Theo ................................................................................................................................................................................. 5 Žitnik Lovro ................................................................................................................................................................................. 40 Zupančič Jernej ............................................................................................................................................................................ 44 77 78 Konferenca / Conference Uredili / Edited by Slovenska konferenca o umetni inteligenci / Slovenian Conference on Artificial Intelligence Matjaž Gams, Mitja Luštrek, Rok Piltaver Document Outline Blank Page Blank Page Blank Page 01_18.pdf 01 - Boshkoska 02 - sigproc-sp 03 - Cvetkovic 04 - sigproc-sp 05 - Dovgan 06 - Gradisek - HEA IS 2016 07 - Janko 08 - Javornik 09 - Ploj---IS2016_lektorirano 10 - ebook-recommendations-2016-revised 11 - Zupancic - metis_use_case Introduction Related work Experimental setup The data Model evaluation The grid Results Metis use-case results The grid experience Conclusion References 12 - Inteligentni_sistemi_IS2016_Šef_Tomaž 13 - Tusar 14 - GjoreskiMartin 15 - IS_2016_DAF_EN_v14 16 - Kljun sigproc-sp Introduction Motivation Hardware and System Architecture Software Conclusions References 17 - tomaz-stepisnik 18 - Avtomatska prepoznava teniških udarcev iz senzorskih podatkov - Miha Mlakar_corrected Blank Page Blank Page Blank Page