Zbornik 19. mednarodne multikonference

INFORMACIJSKA DRUŽBA - IS 2016

Zvezek A

Proceedings of the 19th International Multiconference

INFORMATION SOCIETY - IS 2016

Volume A

Slovenska konferenca o umetni inteligenci

Slovenian Conference on Artificial Intelligence

Uredili / Edited by

Matjaž Gams, Mitja Luštrek, Rok Piltaver

http://is.ijs.si

12. oktober 2016 / 12 October 2016

Ljubljana, Slovenia





Zbornik 19. mednarodne multikonference

INFORMACIJSKA DRUŽBA – IS 2016

Zvezek A



Proceedings of the 19th International Multiconference

INFORMATION SOCIETY – IS 2016

Volume A



Slovenska konferenca o umetni inteligenci

Slovenian Conference on Artificial Intelligence



Uredili / Edited by



Matjaž Gams, Mitja Luštrek, Rok Piltaver





12. oktober 2016 / 12 October 2016

Ljubljana, Slovenia



Uredniki:



Matjaž Gams

Odsek za inteligentne sisteme

Institut »Jožef Stefan«, Ljubljana



Mitja Luštrek

Odsek za inteligentne sisteme

Institut »Jožef Stefan«, Ljubljana



Rok Piltaver

Odsek za inteligentne sisteme

Institut »Jožef Stefan«, Ljubljana





Založnik: Institut »Jožef Stefan«, Ljubljana

Priprava zbornika: Mitja Lasič, Vesna Lasič, Lana Zemljak

Oblikovanje naslovnice: Vesna Lasič





Dostop do e-publikacije:

http://library.ijs.si/Stacks/Proceedings/InformationSociety



Ljubljana, oktober 2016





CIP - Kataložni zapis o publikaciji

Narodna in univerzitetna knjižnica, Ljubljana



004.8(082)(0.034.2)



SLOVENSKA konferenca o umetni inteligenci (2016 ; Ljubljana)

Slovenska konferenca o umetni inteligenci [Elektronski vir] : zbornik 19.

mednarodne multikonference Informacijska družba - IS 2016, 12. oktober 2016,

[Ljubljana, Slovenija] : zvezek A = Slovenian Conference on Artificial

Intelligence : proceedings of the 19th International Multiconference Information

Society - IS 2016, 12 October 2016, Ljubljana, Slovenia : volume A / uredili,

edited by Matjaž Gams, Mitja Luštrek, Rok Piltaver. - El. zbornik. - Ljubljana :

Institut Jožef Stefan, 2016



Način dostopa (URL):

http://library.ijs.si/Stacks/Proceedings/InformationSociety/2016/IS2016_Volume_A%

20-%20SKUI.pdf



ISBN 978-961-264-097-2 (pdf)

1. Gl. stv. nasl. 2. Vzp. stv. nasl. 3. Gams, Matjaž, računalništvo 4. Mednarodna

multikonferenca Informacijska družba (19 ; 2016 ; Ljubljana)

1537202883



PREDGOVOR MULTIKONFERENCI

INFORMACIJSKA DRUŽBA 2016



Multikonferenca Informacijska družba (http://is.ijs.si) je z devetnajsto zaporedno prireditvijo osrednji srednjeevropski dogodek na področju informacijske družbe, računalništva in informatike. Letošnja prireditev je ponovno na več lokacijah, osrednji dogodki pa so na Institutu »Jožef Stefan«.



Informacijska družba, znanje in umetna inteligenca so spet na razpotju tako same zase kot glede vpliva na človeški razvoj. Se bo eksponentna rast elektronike po Moorovem zakonu nadaljevala ali stagnirala? Bo umetna inteligenca nadaljevala svoj neverjetni razvoj in premagovala ljudi na čedalje več področjih in s tem omogočila razcvet civilizacije, ali pa bo eksponentna rast prebivalstva zlasti v Afriki povzročila zadušitev rasti? Čedalje več pokazateljev kaže v oba ekstrema – da prehajamo v naslednje civilizacijsko obdobje, hkrati pa so planetarni konflikti sodobne družbe čedalje težje obvladljivi.



Letos smo v multikonferenco povezali dvanajst odličnih neodvisnih konferenc. Predstavljenih bo okoli 200

predstavitev, povzetkov in referatov v okviru samostojnih konferenc in delavnic. Prireditev bodo spremljale okrogle mize in razprave ter posebni dogodki, kot je svečana podelitev nagrad. Izbrani prispevki bodo izšli tudi v posebni številki revije Informatica, ki se ponaša z 39-letno tradicijo odlične znanstvene revije. Naslednje leto bo torej konferenca praznovala 20 let in revija 40 let, kar je za področje informacijske družbe častitljiv dosežek.



Multikonferenco Informacijska družba 2016 sestavljajo naslednje samostojne konference:

•

25-letnica prve internetne povezave v Sloveniji

•

Slovenska konferenca o umetni inteligenci

•

Kognitivna znanost

•

Izkopavanje znanja in podatkovna skladišča

•

Sodelovanje, programska oprema in storitve v informacijski družbi

•

Vzgoja in izobraževanje v informacijski družbi

•

Delavnica »EM-zdravje«

•

Delavnica »E-heritage«

•

Tretja študentska računalniška konferenca

•

Računalništvo in informatika: včeraj za jutri

•

Interakcija človek-računalnik v informacijski družbi

•

Uporabno teoretično računalništvo (MATCOS 2016).



Soorganizatorji in podporniki konference so različne raziskovalne institucije in združenja, med njimi tudi ACM

Slovenija, SLAIS, DKZ in druga slovenska nacionalna akademija, Inženirska akademija Slovenije (IAS). V imenu organizatorjev konference se zahvaljujemo združenjem in inštitucijam, še posebej pa udeležencem za njihove dragocene prispevke in priložnost, da z nami delijo svoje izkušnje o informacijski družbi. Zahvaljujemo se tudi recenzentom za njihovo pomoč pri recenziranju.



V 2016 bomo četrtič podelili nagrado za življenjske dosežke v čast Donalda Michija in Alana Turinga. Nagrado Michie-Turing za izjemen življenjski prispevek k razvoju in promociji informacijske družbe bo prejel prof. dr.

Tomaž Pisanski. Priznanje za dosežek leta bo pripadlo prof. dr. Blažu Zupanu. Že šestič podeljujemo nagradi

»informacijska limona« in »informacijska jagoda« za najbolj (ne)uspešne poteze v zvezi z informacijsko družbo.

Limono je dobilo ponovno padanje Slovenije na lestvicah informacijske družbe, jagodo pa informacijska podpora Pediatrične klinike. Čestitke nagrajencem!



Bojan Orel, predsednik programskega odbora

Matjaž Gams, predsednik organizacijskega odbora



i

FOREWORD - INFORMATION SOCIETY 2016



In its 19th year, the Information Society Multiconference (http://is.ijs.si) remains one of the leading conferences in Central Europe devoted to information society, computer science and informatics. In 2016 it is organized at various locations, with the main events at the Jožef Stefan Institute.



The pace of progress of information society, knowledge and artificial intelligence is speeding up, but it seems we are again at a turning point. Will the progress of electronics continue according to the Moore’s law or will it start stagnating? Will AI continue to outperform humans at more and more activities and in this way enable the predicted unseen human progress, or will the growth of human population in particular in Africa cause global decline? Both extremes seem more and more likely – fantastic human progress and planetary decline caused by humans destroying our environment and each other.



The Multiconference is running in parallel sessions with 200 presentations of scientific papers at twelve conferences, round tables, workshops and award ceremonies. Selected papers will be published in the Informatica journal, which has 39 years of tradition of excellent research publication. Next year, the conference will celebrate 20 years and the journal 40 years – a remarkable achievement.





The Information Society 2016 Multiconference consists of the following conferences:

•

25th Anniversary of First Internet Connection in Slovenia

•

Slovenian Conference on Artificial Intelligence

•

Cognitive Science

•

Data Mining and Data Warehouses

•

Collaboration, Software and Services in Information Society

•

Education in Information Society

•

Workshop Electronic and Mobile Health

•

Workshop »E-heritage«

•

3st Student Computer Science Research Conference

•

Computer Science and Informatics: Yesterday for Tomorrow

•

Human-Computer Interaction in Information Society

•

Middle-European Conference on Applied Theoretical Computer Science (Matcos 2016)



The Multiconference is co-organized and supported by several major research institutions and societies, among them ACM Slovenia, i.e. the Slovenian chapter of the ACM, SLAIS, DKZ and the second national engineering academy, the Slovenian Engineering Academy. In the name of the conference organizers we thank all the societies and institutions, and particularly all the participants for their valuable contribution and their interest in this event, and the reviewers for their thorough reviews.



For the fourth year, the award for life-long outstanding contributions will be delivered in memory of Donald Michie and Alan Turing. The Michie-Turing award will be given to Prof. Tomaž Pisanski for his life-long outstanding contribution to the development and promotion of information society in our country. In addition, an award for current achievements will be given to Prof. Blaž Zupan. The information lemon goes to another fall in the Slovenian international ratings on information society, while the information strawberry is awarded for the information system at the Pediatric Clinic. Congratulations!



Bojan Orel, Programme Committee Chair

Matjaž Gams, Organizing Committee Chair





ii

KONFERENČNI ODBORI

CONFERENCE COMMITTEES



International Programme Committee

Organizing Committee

Vladimir Bajic, South Africa

Matjaž Gams, chair

Heiner Benking, Germany

Mitja Luštrek

Se Woo Cheon, South Korea

Lana Zemljak

Howie Firth, UK

Vesna Koricki

Olga Fomichova, Russia

Mitja Lasič

Vladimir Fomichov, Russia

Robert Blatnik

Vesna Hljuz Dobric, Croatia

Aleš Tavčar

Alfred Inselberg, Israel

Blaž Mahnič

Jay Liebowitz, USA

Jure Šorn

Huan Liu, Singapore

Mario Konecki

Henz Martin, Germany



Marcin Paprzycki, USA



Karl Pribram, USA

Claude Sammut, Australia

Jiri Wiedermann, Czech Republic

Xindong Wu, USA

Yiming Ye, USA

Ning Zhong, USA

Wray Buntine, Australia

Bezalel Gavish, USA

Gal A. Kaminka, Israel

Mike Bain, Australia

Michela Milano, Italy

Derong Liu, Chicago, USA

Toby Walsh, Australia





Programme Committee

Bojan Orel, chair

Andrej Gams

Vladislav Rajkovič Grega

Nikolaj Zimic, co-chair

Matjaž Gams

Repovš

Franc Solina, co-chair

Marko Grobelnik

Ivan Rozman

Viljan Mahnič, co-chair

Nikola Guid

Niko Schlamberger

Cene Bavec, co-chair

Marjan Heričko

Stanko Strmčnik

Tomaž Kalin, co-chair

Borka Jerman Blažič Džonova

Jurij Šilc

Jozsef Györkös, co-chair

Gorazd Kandus

Jurij Tasič

Tadej Bajd

Urban Kordeš

Denis Trček

Jaroslav Berce

Marjan Krisper

Andrej Ule

Mojca Bernik

Andrej Kuščer

Tanja Urbančič

Marko Bohanec

Jadran Lenarčič

Boštjan Vilfan

Ivan Bratko

Borut Likar

Baldomir Zajc

Andrej Brodnik

Janez Malačič

Blaž Zupan

Dušan Caf

Olga Markič

Boris Žemva

Saša Divjak

Dunja Mladenič

Leon Žlajpah

Tomaž Erjavec

Franc Novak

Bogdan Filipič





iii





iv



KAZALO / TABLE OF CONTENTS



Ime konference SLO / Ime konference ENG ............................................................................................................. 1

PREDGOVOR / FOREWORD ................................................................................................................................. 3

PROGRAMSKI ODBORI / PROGRAMME COMMITTEES ..................................................................................... 4

Usage of SIGMO, a Decision Support System for the Assessment of Genetical y Modified Organisms in

Food and Feed / Mileva Boshkoska Biljana, Bohanec Marko, W. Prins Theo, J. Kok Esther .......................... 5

Paral el Draws from the Polya-Gamma Distribution for Faster Bayesian Multinomial and Count Model

Inference / Češnovar Rok, Štrumbelj Erik ......................................................................................................... 9

Fit4Work – Monitoring and Management of Physical, Mental and Environmental Stress at Work /

Cvetković Božidara, Gjoreski Martin, Frešer Martin, Kosiedowski Michał, Luštrek Mitja ................................. 13

Bayesian Binary and Ordinal Regression with Structured Uncertainty in the Inputs / Dimitriev Aleksandar, Štrumbelj Erik ................................................................................................................................................... 17

Multiobjective Discovery of Driving Strategies Using the SCANeR Studio / Dovgan Erik ................................... 21

Ali nam metode strojnega učenja lahko pomagajo pri načrtovanju novih visokoentropijskih zlitin? /

Gradišek Anton, Kocjan Andraž, Mlakar Miha ................................................................................................. 25

Markov Chain Model for Energy-Efficient Context Recognition / Janko Vito, Luštrek Mitja ................................. 28

Reliability Estimation of Individual Predictions for Incremental Models / Javornik Anže, Kononenko Igor, Pevec Darko ..................................................................................................................................................... 32

Dve nadgradnji algoritma backpropagation / Ploj Bojan, Terbuc Martin .............................................................. 36

Ebook Recommendations Based on Stylometric Features / Žitnik Lovro, Robnik-Šikonja Marko ....................... 40

Model Selection on the JSI Grid: Metis Use-Case / Zupančič Jernej, Kužnar Damjan, Gams Matjaž ................ 44

Sinteza slovenskega govora na mobilni platformi Android / Šef Tomaž .............................................................. 48

Showing the Knee of a 4-D Pareto Front Approximation via Different Visualization Methods / Tušar Tea, Filipič Bogdan ................................................................................................................................................... 52

Machine Learning Method for Stress Detection with an EEG Device / Gjoreski Martin, Luštrek Mitja,

Gams Matjaž .................................................................................................................................................... 56

Design of Mobile Systems that Enable an Active and Meaningful Life for Dementia Patients / Fabjan

David................................................................................................................................................................. 60

SeaAbs: Developing a Low Cost Seafloor Observatory / Kljun Matjaž, Tošić Aleksandar .................................. 64

Primerjava razdalj pri klasifikaciji in regresiji / Stepišnik Perdih Tomaž, Panov Panče, Bauer Andrej, Džerovski Saso ................................................................................................................................................. 68

Avtomatska prepoznava teniških udarcev iz senzorskih podatkov / Mlakar Miha ............................................... 72

Indeks avtorjev / Author index ................................................................................................................................ 77





v





vi



Zbornik 19. mednarodne multikonference

INFORMACIJSKA DRUŽBA – IS 2016

Zvezek A



Proceedings of the 19th International Multiconference

INFORMATION SOCIETY – IS 2016

Volume A



Slovenska konferenca o umetni inteligenci

Slovenian Conference on Artificial Intelligence



Uredili / Edited by



Matjaž Gams, Mitja Luštrek, Rok Piltaver





12. oktober 2016 / 12 October 2016

Ljubljana, Slovenia

1





2

PREDGOVOR



Slovenska konferenca o umetni inteligenci je naslednica konference Inteligentni sistemi, ki je sestavni del multikonference Informacijska družba že od njenega začetka leta 1997. Za novo ime smo se odločili

zaradi tesnejšega sodelovanje s Slovenskim društvom za umetno inteligenco. Pomemben cilj tako

konference kot tudi društva je namreč povezovanje slovenskih raziskovalcev s tega področja, čeprav na konferenci niso nič manj dobrodošli tudi prispevki iz drugih držav. Žal letos nismo prejeli dovolj

kvalitetnih prispevkov iz tujine, smo pa pritegnili nadpovprečen delež prispevkov iz organizacij izven Instituta »Jožef Stefan«, posebej s Fakultete za računalništvo in informatiko, ki ima skupaj z Institutom vodilno vlogo pri raziskavah umetne inteligence v Sloveniji, kar štejemo za uspeh.



Tema konference ostajajo umetna inteligenca, inteligentni sistemi in inteligentne storitve informacijske družbe. Obravnavamo metode umetne inteligence, njene aplikacije v praksi, vpliv na družbo in

gospodarstvo, njeno prihodnost ter priložnosti in nevarnosti, ki jih vnaša v informacijsko družbo. Ker konferenca stremi h kakovostnim mednarodno zanimivim prispevkom, je večina izmed 18 sprejetih v

angleščini; a ker gre vendar za slovensko konferenco, ki želi spodbuditi dialog in sodelovanje med

domačimi raziskovalci, sprejemamo tudi slovenske. Vsi prispevki so bili recenzirani in popravljeni v skladu z navodili recenzentov. Menimo, da je letošnja kakovost prispevkov visoka in upamo na še višjo prihodnje leto, ko bo konferenca praznovala 20. obletnico.





FOREWORD



Slovenian Conference on Artificial Intelligence is the successor of the Intelligent Systems conference, which has been a part of the Information Society multiconference since its establishment in 1997. The new name was adopted because of a closer collaboration with the Slovenian Artificial Intelligence

Society. The reason for this collaboration is that the goal of both the conference and the society is bringing together Slovenian artificial intelligence researchers, although international papers are just as welcome at the conference. Unfortunately we have not received enough such papers this year, but we are proud to have attracted an unusually large number of papers from outside the Jožef Stefan Institute, particularly from the Faculty of Computer and Information Science, which shares the leading role in artificial

intelligence research in Slovenia with the Institute.



The topics of the conference remain artificial intelligence, intelligent systems and intelligent services in the information society. The conference addresses artificial intelligence methods, their practical

application, the impact of the artificial intelligence on the society and industry, its future, and the opportunities and threats it presents to the information society. Since the conference aims at international relevance, the majority of the 18 accepted papers are in English; however, being a Slovenian conference aiming to open a dialogue and collaboration between national researchers, papers in Slovenian are also accepted. All the papers were reviewed and revised according to the reviewers' comments. The quality of this year's papers is high, and we hope for an even higher one next year when the conference will celebrate its 20th anniversary.



Matjaž Gams, Mitja Luštrek, Rok Piltaver





3

PROGRAMSKI ODBOR / PROGRAMME COMMITTEE

Mitja Luštrek, IJS (co-chair)

Matjaž Gams, IJS (co-chair)

Rok Piltaver, IJS (co-chair)

Marko Bohanec, IJS

Tomaž Banovec, Statistični zavod

Cene Bavec, UP, FM

Jaro Berce, UL, FDV

Marko Bonač, ARNES

Ivan Bratko, UL, FRI in IJS

Dušan Caf, FIŠ

Bojan Cestnik, Temida

Aleš Dobnikar, CVI

Bogdan Filipič, IJS

Nikola Guid, FERI

Borka Jerman Blažič, IJS

Tomaž Kalin, DANTE

Marjan Krisper, FRI

Marjan Mernik, FERI

Vladislav Rajkovič, FOV

Ivo Rozman, FERI

Nikolaj Schlamberger, Informatika

Tomaž Seljak, IZUM

Miha Smolnikar, IJS

Peter Stanovnik, IER

Damjan Strnad, FERI

Peter Tancig, RZ

Pavle Trdan, Lek

Iztok Valenčič, GZ

Vasja Vehovar, FDV

Martin Žnidaršič, IJS





4





Usage of SIGMO, a Decision Support System for the

Assessment of Genetically Modified Organisms in Food

and Feed





Biljana Mileva

Marko Bohanec

Theo W. Prins

Esther J. Kok

Boshkoska

Jožef Stefan Institute

RIKILT Wageningen

RIKILT Wageningen

Faculty of information

Jamova 39, Ljubljana,

University & Research

University & Research

studies in Novo mesto,

and

6700 AE Wageningen,

6700 AE Wageningen,

Ljubljanska cesta 32,

University of Nova

Netherlands

Netherlands

Novo mesto, and

Gorica

theo.prins@wur.nl

Jožef Stefan Institute

marko.bohanec@ij

esther.kok@wur.nl

Jamova 39, Ljubljana

s.si





biljana.mileva@ijs.





si





ABSTRACT

products can reach the market and then be further on inspected. In

SIGMO is a web-based decision support system for assessing the

cases of finding unauthorized GMOs in products, these will be

probability of existence of a genetically modified organisms in food

withdrawn from the market, leading to significant financial losses

and feed products imported in the European Union. The system has

for the traders. To assist the traders and producers of complex

been developed in the frame of the EU FP7 project Decathlon. In

products, whose ingredients may contain GMOs, a decision support

this paper, we describe SIGMO, provide examples of how to use it

system called SIGMO (System for Identification of GMOs) has

for training and evaluation purposes, and report the statistics of its

been developed within the European Framework 7 project

actual use.

‘Decathlon’ (http://www.decathlon-project.eu/). In this paper, we

describe SIGMO, illustrate its application for training and

Keywords

evaluation through several examples, and report statistics of using

decision support system, qualitative multi-criteria modeling,

the system since it has been made available on-line.

genetically modified organisms, SIGMO



1. INTRODUCTION

2. SIGMO

The import of genetically modified organisms (GMOs) in the

SIGMO [6] is a web-based decision support system (DSS) that has

European Union (EU) has begun in 1994, and since then EU has

been designed to provide help to producers, traders and importers

developed different monitoring mechanisms to assess the risk of

with the aims to:

entering any GMO in its market. The goal of the risk assessment is

•

reduce the number of necessary GMO analyses;

to ensure that imported food or feed is safe for human and animal

health and the environment. Hence, the EU treats all genetically

•

better cope with the complexity of GMO market without

modified crops (GMO crops) as "a modified food". All GMO crops

requiring its users to have extensive knowledge on the

that are intended to be admitted to the EU, are extensively evaluated

world-wide production;

by the European Food Safety Authority (EFSA) [1], which reports

its findings to the European Commission (EC). The EC then

•

comply with EU GMO regulations in a cost-effective

way.

proposes whether to grant or refuse the authorisation for the GMO

[2]. In the EU, currently (September 2016) 68 GMOs are authorised

SIGMO is composed of:

in cotton (10 GMOs), maize (36), oil seed rape (6), soybean (15),

and sugar beet (1) [3]. Despite the large number of authorized

•

a data base providing data about GMO crop species

GMOs, the import of non-authorized GMOs is still being detected

produced and approved in counties worldwide;

in EU, in particular from the US, Canada, Argentina, China and

•

a multi-attribute model for the assessment of GMO

Brazil [4]. Whenever a product that contains GMO is imported, this

presence in food/feed products; and

information needs to be provided on the product label. There is a

threshold of 0.9% for the adventitious and technically unavoidable

•

an

on-line

user

interface

available

at

presence of authorized GMOs in non-GMO batches that do not

www.decathlon.ijs.si/gmo/.

require labelling [5, 6].



Regardless of the strict measures, it still happens that products



containing GMOs without labeling are imported into the EU. These

5



2.1 SIGMO database

Figure 1. Hierarchical structure of the SIGMO multi-attribute



model

The SIGMO database consists of tree tables that hold data on

The SIGMO model, as given in Figure 1, is implemented in DEXi

countries, the list of possible status types that a GMO-containing

software package [8]. To assess the likelihood of GMO presence in

product may obtain regarding the GM presence, such as “High

a product, the model is at the highest level decomposed into two

likelihood”, “Medium likelihood”, and “Low likelihood”; and a list

main subtrees:

of all currently listed GMOs species. These tree tables are

connected with three relations. The first relation, which defines the

•

TraceabilityData – data that accompanies each product

and describes its geographical origin and path in the

area of a particular crop planted in a certain country, determines if

a certain GMO/country pair belongs to a region of increased GMO

production/supply chain [9], and

production. The second relation holds information regarding each

•

AnayticalData – provide information about the approved

particular GMO event. For example, maize may have several gene

and/or unapproved GMOs already detected in the

modifications, each one having a different Event name. Such names

product, and about the methods used to analyze the

would be “MON810” encoded as “MONØØ81Ø-6”. Finally, the

products for GMO content [10, 11].

third one defines a many-to-many relation between Events and

Usually analytical data are rarely available with a product, but if

Countries, associated with a particular status type. At the current

they are available, they are potentially more relevant for assessing

implementation, the database holds over 300 GMOs.



the possible presence of (unauthorized) GMOs than traceability

data and should thus take precedence.

2.2 SIGMO multi-attribute decision model

A more detailed description of attributes and utility functions of the

The central part of SIGMO is a multi-attribute decision model that

model are given in [6].

provides an assessment of the potential presence of GMOs in

imported feed or food products. The model has been developed



using the methodology DEX (Decision Expert) [7]. DEX belongs

2.3 SIGMO user interface

to the group of qualitative multi-criteria decision making methods.

In DEX, a hierarchical structure containing qualitative attributes is

When using SIGMO (http://decathlon.ijs.si/gmo/), the user is

built which represents a decomposition of the decision problem into

presented with an interactive input form, such as the one presented

smaller, less complex and possibly easier to solve sub-problems.

in Figure 2, in which the user enters data about the product to be

There are two types of attributes in DEX: basic and aggregated. The

assessed. The system provides guidance through drop-down data-

former are the directly measurable attributes, which are used for

entry lists and info buttons on the right hand side of each data-entry

describing the decision options and represent input data to the

field.

model. The latter are obtained by aggregating the basic and/or other

aggregated attributes. They represent the evaluations of the options.

The hierarchical structure represents the dependencies among

attributes such that the higher-level attributes depend on their

immediate descendants in the tree. This dependency is defined by

a utility function by the expert.

Attribute tree

Attribute

Description

GM_Presence

Assessment of GM presence in food/feed products

TraceabilityData

GM presence due to traceability data

Products

GM presence due to product characteristics

ProductGMPresence

Assessment of GM presence in the product

CropGMPresence

Assessment of GM presence related to the crop

GeoGMPresence

Assessment of GM presence related to the geographical origin of the product

EU

Does the product originate in an EU country?

GM_Region

Does the product originate in a region of large GMO production?

ProductComplexity

Product type

Countries

Likelihood of GM presence due to the properties of countries and regions of origin

NumberCountries

Number of countries involved in storage

CountryGMPresence

Likelihood of GM presence with respect to involved countries

CoexistenceMeasures

Are coexistence measures in place in countries?

Transportation

Likelihood of GM presence due to transportation

PrepackagedProduct

Is product prepackaged?

Logistics

Likelihood of GM presence due to logistics

LogComplexity

GM presence due to logistics complexity

NumberInteractions

Number of interactions in the supply chain

NumberCompanies

Number of companies involved in logistics

LogStorage

Likelihood of GM presence due to storage used

Harbour

Has the product been shipped through harbour(s)?

Silo

Has the product been stored in silos?

SystemsUsed

GM presence due to used traceability systems

TraceabilitySystemInPlace

Is a traceability system in place?

IP_GMO

Are IP systems for GMO being used?

IP_Other

Are other IP systems being used?

AnalCtrl_Systems

Are there systems used that include analytical control?

PrivateContracts

Are there any private contracts?

AnalyticalData

GM presence due to analytical data

AnalyticalResults

GM presence according to analytical results

AnalyticalResultsAvailable

Are analytical results available?

ApprovedGMOsIdentified

Approved GMOs identified



UnapprovedGMOsIdentified

Unapproved GMOs identified

Methods

Risk due to applied methods

Figure 2. SIGMO user interface

ProcessingLevel

Processing level

AppropriateSampling

Have appropriate sampling methods been used?

AppropriateMethods

Have appropriate analysis methods been used?

Additionally, some parts of the input form are optional and open up

Reliability

Reliability of applied methods

ReliabilityForApprovedGMO

Reliability of applied methods to detect approved GMOs

RelevantGMCropsIncluded

Did the analysis include al the relevant GMO crops?

only when necessary. Such examples are:

Al IngredientsIncluded

Are al ingredients, listed on the product label, included in the analysis?

OmnipresentGMOsIncluded Are omnipresent GMO varieties included in the analysis?

NumberScreenElem

Number of screening elements

ReliabilityForUnapprovedGMO

Reliability of applied methods to detect unapproved GMOs

•

only when Product Complexity is “complex”, more than

NumberScreenElem

Number of screening elements

AppropriateDataAnalysis

Application of appropriate data analysis methods

one ingredient can be entered;

AppliedQualitySystem

Can we trust the applied methods and analytical lab?

ValidatedMethods

Application of validated analytical methods

AccreditedLab

Is analytical laboratory acredited?





•

when Analytical results available is “yes”, further detail

about analytical results are requested;

6



•

for products that are not pre-packaged, further data for

Figure 3. Cream Style Corn product from Thailand. Available

transportation is requested;

from

https://www.alibaba.com/product-detail/Cream-Style-

Corn_131818000.html

After the data has been entered, pressing the button “Evaluate”

shows the output page that displays the results of assessment. These



results can be further explored by “drilling down” the evaluation

3.3 Example 3

tree.

In this example we consider products that are made of several

SIGMO also provides a print page, which is suitable for saving and

ingredients, each of which may have different probability of GMO

printing the evaluation results, and a help page describing the

presence. For example, one may be tempted to buy pasta made form

meaning of input data fields.

maize and rice that are produced in Philippines. To assess the

probability of presence of GMOs, we use SIGMO to evaluate both

The system’s architecture was designed by employing concepts that

ingredients separately regarding the possibility of GMOs. As a final

will allow seamless scalability. In that manner, the system is

result, SIGMO aggregates the outputs of the evaluations into one

depends on MongoBD and MySql. The application is Django web

recommended answer. For the provided example, SIGMO

application that runs on Apache through mod_wgsi. As such the

evaluates the possibility of unauthorized genes in the selected

application can be easily scaled in order to accommodate

products as high. However, SIGMO also considers other attributes.

substantial traffic increase.

Considering the fact that the product is packed, the possibility of



commingling with other maize or rice is none. Therefore, the

3. USAGE OF THE SIGMO DSS

system evaluates the risk of GMOs in the product as medium to low.

Here we present several examples of how to use the SIGMO DSS



in order to answer important GMO related questions about

3.4 Example 4

imported feed and food products.

Consider that one wants to find out the GMO presence in complex



products consisting from rice and papaya produced in China. The

3.1 Example 1

product is packed, and comes with a documentation from which the

In the first example we answer the following question: given the

user may see that the product provides Analytical results from a

origin of a certain product that consists of only one ingredient (for

laboratory. The results show that no Approved or Unapproved

example popcorn), can one find all crops that may contain genetic

GMOs were identified by the laboratory. Hence a new drop-down

modifications? To answer this question we consider a product that

menu is shown that asks questions regarding the laboratory results.

is made of crops, and that originates from Indonesia. Running

The user answers as follows: “yes” for GM presence in country ,

SIGMO and checking the Crop Species field, we get the list of all

“moderate” for Processing Level, “yes” for Appropriate Sampling,

crops that are produced in Indonesia and which potentially may

“yes” for All Ingredients Included, “yes” for Omnipresent GMOs

contain unauthorized gene modifications in EU. The list contains

Included, “many” for Number of Screening Elements, “yes” for

Maize, Soybean and Sugarcane.

Appropriate Data Analysis, “yes” for Validated Methods, “yes” for

Accredited Lab. The next question is Is product pre-packed to



which the user says “no”. Hence, the user is asked a new set of

question about the product transportation. The answers are “few”

3.2 Example 2

for Number of interactions, “no” for Harbour, and “no” for Silo.

Another relevant question when importing products from outside

the EU is whether the country of origin has large fields that are

Given these data, SIGMO provides information on a-priori GM

populated with certain GMOs thus providing higher possibility for

presence in the product which is “high” for rice and “med” for

importing them in EU. For example, using the SIGMO model, one

papaya. Still, SIGMO’s final assessment of GMO likelihood is

may be interested whether or not Cream Style Corn canned product

v_low. The rationale is that the product has been inspected by a

(see Error! Reference source not found. ) that contains maize and

laboratory that did not find any authorized or unauthorized GMOs,

is produced in Thailand, is produced in a region of large GMO

and all additional questions regarding the laboratory indicate that

production. By selecting the appropriate values for Country and

the product has been properly inspected. Although the product has

Crop Species, SIGMO provides the result no.

not been packaged, it also has neither been stored in silos nor had

other interactions with other products at harbors. Thus, the final

recommendation from SIGMO is that there is a very low

probability that the product contains GMOs.



3.5 Usage of SIGMO

So far, SIGMO has been presented twice during the Decathlon

project to project participants, traders, producers and importers:

once in Lisbon, Portugal in 2015, and once in Shanghai, China in

2016. To measure the impact and usage of the system we have

anonymously collected the country of accession of the SIGMO’s

web page. A histogram of different accesses to SIGMO in the



7





period of 15.1. – 15.7.2016 is given in Figure 4. From the histogram

5. ACKNOWLEDGMENTS

we can see that although the system has been presented only to

The research leading to these results has received funding from the

participants

from

involved

project

countries

European Union’s Seventh Framework Programme for research,

(http://www.decathlon-project.eu/article/consortium), SIGMO has

technological development and demonstration under grant

been used worldwide. Additionally, we present the monthly

agreement

FP7-KBBE-2013-7-613908-Decathlon

distributions of web-accesses of SIGMO in the period from

(http://www.decathlon-project.eu).

December 2015 to September 2016 in Figure 5. The histogram

shows increases of web-access from less than 100 in December,

6. REFERENCES

2015 to more than 800 in July, 2016.

[1] European

Food

Safety

Authority.

2016.

http://www.efsa.europa.eu/.

[2] European Commission Fact Sheet. 2015. Questions and

Answers on EU's policies on GMOs Press release Database,"

http://europa.eu/rapid/press-release_MEMO-15-

4778_en.htm.

[3] European Commission. 2016. Genetically Modified

Organisams.

EU

register

of

authorised

GMOs.

http://ec.europa.eu/food/dyna/gm_register/index_en.cfm.

[4] European Commission. 2016. RASFF - Food and Feed Safety

Alerts. http://ec.europa.eu/food/safety/rasff/index_en.htm.

[5] European Commission. 2003. Regulation (EC) No



1829/2003 of the European parliament and of the council of

Figure 4. Different accesses points to SIGMO in the period of

22 September 2003 on genetically modified food and feed.

January, 2015 to July, 2016

Official Journal of the European Union, L268, 1 - 23.



[6] M. Bohanec, B. M. Boshkoska, T. W. Prins and E. J. Kok.

2016. SIGMO: A decision support System for Identification

of genetically modified food or feed products. Food Control,

168–177.

[7] M. Bohanec, V. Rajkovič, I. Bratko, B. Zupan and M.

Žnidaršič. 2013. DEX methodology: Three decades of

qualitative multi-attribute modelling. Informatica, 49 - 54.

[8] M. Bohanec. 2015. DEXi: Program for multi-attribute

decision making, User’s manual, version 5.00. Ljubljana:

Jožef Stefan Institute. IJS Report DP-11897. Ljubljana.

[9] European Commission. 2013. Regulation No. 1830/2003

concerning the traceability and labelling of genetically



modified organisms and the traceability of food and feed

Figure 5. Distribution of SIGMO web-site visits per month in

products produced from genetically modified organisms and

the period from December, 2015 to August 2016

amending Directive 2001/18/EC.

4. CONCLUSION

[10] A. Holst Jensen, 2009. Testing for genetically modified

The main goal of SIGMO is to help traders, producers and

organisms (GMOs): Past,present and future perspectives.

importers to assess the probability of existence of (un)authorized

Biotechnology Advances, 27, 71 - 1082.

GMOs in the feed and food products. Here we presented four

[11] A. J.Arulandhu, J. P. V. Dijk, D. Dobnik, A. Holst-Jensen, J.

examples of how to use the system to achieve its goal. In addition

Shi and J. Zel. 2016. Critical review: DNA enrichment

we presented the number of accesses to the system in a six month

approaches to identify unauthorised genetically modified

period. Although the system has been presented only in Portugal

organisms (GMOs). Analytical and Bioanalytical Chemistry.

and in China, it has been accessed from 22 countries. The presented

examples can be used in future also for demonstration purposes to





students studying for example decision sciences, in particular the

DEX methodology.





8





Parallel Draws from the Polya-Gamma Distribution for

Faster Bayesian Multinomial and Count Model Inference

Rok Češnovar

Erik Štrumbelj

University of Ljubljana

University of Ljubljana

Faculty of Computer and Information Science

Faculty of Computer and Information Science

Večna pot 113

Večna pot 113

Ljubljana, Slovenia

Ljubljana, Slovenia

rok.cesnovar@fri.uni-lj.si

erik.strumbelj@fri.uni-lj.si

ABSTRACT

logistic regression is a special case of Eq. (1) with all ni =

We propose an algorithm for drawing random variates from

1. Typically, Laplace approximation [1, p.

213] or the

the two-parameter Polya-Gamma distribution PG(h, z) in

Metropolis algorithm [1, p.

541] are used to infer from

parallel on GPU, when h is integer. This distribution plays

this model. Note that β are the k model coefficients, and

an important role in Gibbs-sampling schemes for inference

µ0, Σ0 the prior mean vector and prior covariance matrix,

from Bayesian categorical and count models. We demon-

respectively.

strate speedups of the order of 100 on mid-range and of the

order of 10 on low-end GPU and how these speedups trans-

Recently, Polson et al. [9] proposed a data-augmentation

late to similar speedups when the sampler is used in Gibbs

scheme that leads to a simple Gibbs sampler for the posterior

sampling for binomial regression.

of the Bayesian Binomial regression model Eq. (1)

w

Keywords

i|β ∼ PG(ni, βT xi)

(2)

β|w ∼ N

OpenCL, random variables, sampling, Bayesian inference

k (µw , Σw )

where µw = Σw(XT κ+Σ−1µ

µ

0

0), Σw = (X T W X +Σ−1

0

0)−1,

1.

INTRODUCTION

κ = y − ni , W is the diagonal matrix of w, and PG(h, z)

2

Bayesian methods play an important role in modern statis-

is the Polya-Gamma distribution, which is the focus of this

tics and machine learning due to their robustness and con-

paper and will be discussed in more detail in Section 2.

ceptual simplicity. The main drawback of Bayesian methods

is that it is often analytically and computationally difficult

This data-augmentation scheme in Eq. (2) is more efficient

to infer from the posterior distribution. Instead, we typically

than Metropolis samplers and other data-approaches (see

resort to structural approximation or, when we are inter-

[9] for details) and, which is even more important, easily ex-

ested in the full posterior density, computationally-intensive

tends to other categorical and count models, such as Multi-

sampling-based approximation, in particular, Markov Chain

nomial Logistic regression and Negative Binomial regression.

Monte Carlo (MCMC).

However, the Polya-Gamma distribution is not easy to sam-

For example, take the binomial regression setting with n

ple from and can be a performance bottleneck, in particular

samples and k input variables. Let yi ∈ {0, 1} be the de-

when h is large. In this paper we propose an algorithm for

pendent variable, xi ∈ Rk the input variables, and ni the

sampling from PG(h, z) in parallel on GPU, when h is in-

number of attempts for the i−th sample, i = 1..n. Even for

teger, which is sufficiently general for categorical and count

a standard and relatively simple model, such as the Bayesian

models. We demonstrate its utility as a stand-alone random

binomial regression model

variable generator and when used as part of a Gibbs sampler

for Bayesian inference.



1



yi ∼ Binomial ni, 1 + e−βTxi

(1)

2.

PG DISTRIBUTION AND SAMPLING

β ∼ Nk(µ0, Σ0)

In this section we briefly review the properties of the PG

neither the joint posterior nor the full conditional posterior

distribution relevant to our work. The Polya-Gamma distri-

distributions can be derived analytically. Note that Bayesian

bution PG(h, z) is a two-parameter continuous distribution

(h > 0, z ∈ R). The first important property of the PG

distribution is that if X ∼ PG(hx, z) and Y ∼ PG(hy, z)

then X + Y is distributed PG(hx + hy, z). Therefore, for

integer h, a random variate from PG(h, z) can be generated

using h random variates from PG(1, z).

The above property allows us to focus on sampling from

PG(1, z). Our parallel algorithm is based on the algorithm

from [9], which is based on sampling from the (exponentially

tilted) Jacobi distribution J∗(1,z) [4] and the relationship

9





that if X ∼ J∗(1,z/2) then Y = X/4 ∼ PG(1, z). Therefore,

parallelizing for the GPU, we are dealing with the trade-off

our sampling problem reduces to sampling from J∗(1,z).

between the demands for a very large number of threads and

the demand that the threads have a large enough workload

The density f (x) of a J∗(1,z) distributed random variable

so that the cost of creating threads does not outweigh the

contains an infinite sum and can only be approximated (the

benefits of parallel execution. We also want to have simi-

same holds for the density of PG(1,z), of course), which

lar workloads in the threads, because uneven workloads are

makes sampling difficult. However, it has an alternating

huge performance drawbacks in GPU programming.

series representation

The naive approach would be to split the problem, so that

∞





X

z2x

f (x) =

(−1)icosh(z) exp −

a

each sampling i from PG(hi, zi) is a task. This way each

2

i(x),

i=0

task is independent, therefore requiring no communication.

However, we could have the problem of uneven workloads,

since threads that would perform draws with large hi would



n

o

have a substantially higher workload than threads with small

π(i + 1/2)( 2 )3/2 exp

− 2(i+1/2)2

0 < x < t

a

πx

x

h

i(x) =

i. Furthermore, with small R, the low number of threads

n

o

π(i + 1/2) exp

− 2(i+1/2)2π2 x

x > t,

would not fully utilize GPUs performance.

2

where the optimal choice of t = 0.64 [4].

To deal with these drawbacks we define a task to be a single

draw from PG(1, zi). The final draws PG(hi, zi) are calcu-

The partial sums of the series Si(x) = Pi

(−1)ja

j=0

j (x) sat-

lated on the CPU by summing hi samples from PG(1, zi),

isfy

as explained in Section 2. This way tasks are more ho-

S

mogeneous and the workload of threads more even, as the

0(x) > S2(x) > · · · > f (x) > · · · S3(x) > S1(x),

computational complexity of PG(1, zi) is practically inde-

which facilitates a modified rejection algorithm for sampling

pendent of zi. The number of tasks in this parallelization is

from f (x). The problem that f (x) can not be evaluated is

Rtask = Σli=1rihi. Note that Rtask > R if there are hi > 1,

circumvented by iteratively calculating the partial sums and

so we are able to better utilize the GPU when R is small.

stopping early using the above property of Si(x). Given an

upper hull (envelope) function g(x) > f (x), we can sample

The homogeneity of tasks also gives us the ability to easily

from f (x) by sampling from g(x) and accepting each sam-

group them in larger but still homogeneous workloads. Each

ples x0 with probability f (x0)/q(x0), which can be done by

thread thus performs batches of B draws from PG(1, zi)

sampling U ∼ U (0, g(x0)) and checking U < f (x0). Due to

instead of single draws. This ability is useful when searching

the property of the partial sums, we know that all odd terms

for the optimal point in the aforementioned trade-off. For

are less than f (x) - therefore, if U < Si(x0) for odd i, we

example, when Rtask is much greater than the number of

can deduce U < f (x) and accept the sample. Similarly, if

execution units C on the GPU. If each thread would execute

U > Si(x0) for even i, we can deduce U > f (x) and reject

a single draw, the benefits of parallelization would be small

the sample.

as Rtask-C threads would be assigned to the queue. The cost

of creating tasks would therefore outweigh these benefits. In

The first partial sum S0(x) is a natural (and efficient [9])

cases when C > Rtask, we could define B = 1, thus utilizing

choice for the upper hull function g(x) and it can be sampled

all the execution units. With the ability to easily fine-tune

from a mixture of a truncated inverse-Gaussian distribution

B we can get the best performance from GPUs of different

and a truncated exponential distribution

vendors and architectures. Note that parallelization within

(

the execution of single draws from P G(1, z

IG(|z|−1, 1)I

i) is not feasible

X ∼

(0,t]

with prob. p see [9]

as these parallel tasks would require a lot of communication,

Exp(−z2/2 + π2/8)I(t,∞] with prob.1 − p.

outweighing the benefits of parallel execution.

Left-truncated exponential variates can be generated triv-

ially by translating and scaling a Exp(1) variate. Inverse-

3.1

Pseudo-Random number generation and

Gaussian variates are sampled using the method of [5], which

sampling from probability distributions

is based on Exp(1) and standard normal N(0, 1) variates.

For further details on the above algorithm for sampling from

The algorithm for drawing from PG(1, z) requires random

J∗(1,z) and therefore PG(1, z) see [9]. Note that other, faster

variates from distributions U (0, 1), N (0, 1), and Exp(1). We

approximate approaches for sampling from PG have been

implemented the XORShift128 [7] uniform random number

developed [11], but are unstable for larger h.

generator, as this generator is one of the fastest on the GPU

and has satisfactory statistical properties [3, 6]. Unit uni-

3.

PARALLEL SAMPLING ON GPU

form variates U (0, 1) are generated trivially, by dividing the

generated number with the maximum possible uniform ran-

The basic case of sampling from the distribution is draw-

dom number. Standard normal variates N (0, 1) are gener-

ing r samples from PG(h, z). In this paper we focus on a

ated using U (0, 1) variates and the Box-Mueller transform

more generalized problem of performing l simultaneous sam-

[2]. Exponential variates Exp(1) are generated using the

plings, each with different ri, hi, and zi. The total number

U (0, 1) variates and the inverse transform method.

of samples to be drawn is R = Σli=1ri.

When parallelizing, the first step is to split the problem into

3.2

Data transfers

smaller tasks that require minimal amount of communica-

The input data required for sampling from P G(1, z) are:

tion. Each task is then assigned to a single thread. When

the array of all Rtask z values and the seeds for the random

10



number generator in each thread. Values z are floating-point

0.1

0.5

numbers while random seeds consist of 4 unsigned integer

2

values. The output data of the calculation on the GPU are

the Rtask real-valued samples from P G(1, z). The input data

l

m

1

size is therefore F × R

Rtask

task + 4 × 4 ×

bytes, while the

B

output data size is F × Rtask bytes, where F is 8 for double

0

precision or 4 for single precision. If the input and output

data size is larger than the size of global memory, execution

is split in smaller steps which has a negative effect on overall

−1

speedup. For a mid-range notebook GPU with 4GB of global

Config

memory, the execution has to be split for R

II

task > 227 in the

speedup)

1

2

(

worst case of B = 1 and F = 8.

I

10

2

log

3.3

Thread organization

1

Both modern and older GPU consist of a large number of

small and simple execution units, grouped together into a

smaller number of groups. These groups of units are termed

0

Streaming Multiprocessors (SM) in Nvidia and Computa-

tion Units (CU) in AMD/ATI graphics processing units. Ex-

−1

ecution units in a single CU/SM share more advanced logic

for scheduling, caches, registers, etc. How threads get as-

2

4

6

2

4

6

log

signed to these executions units, SM or CU depends on the

10( h )

thread organization. GPU threads are organized in work-

groups or blocks of the same size. Each block is assigned to

Figure 1: A plot of speedups (BayesLogit algorithm

a single SM/CU and the threads inside the block are exe-

time divided by GPU algorithm time) against size of

cuted in the SM/CU execution units.

h. A log-log scale is used to simplify interpretation.

Dashed line indicate for which n both algorithms

Threads in the same block therefore share the registers and

take the same amount of time.

caches. Local values for threads are stored in registers. If all

the registers are used, the local data for non-active threads

is cached or even stored in global memory, depending on the

ment is an indirect comparison of the sampling algorithms

architecture and the amount of data used. If we want to

through the use in Bayesian inference.

achieve optimal performance we want to avoid global mem-

ory access. Beside memory access, parallelization on the ex-

In the following two subsections, we describe the experi-

ecution unit level and SM/CU level is the other key factor

ments and report results on two different hardware con-

when determining thread block size. If we organize threads

figurations I (CPU: Intel Core i7-4790 @ 3.60GHz, GPU:

in a large number of small blocks we can underutilize the ex-

AMD Radeon HD 7970, RAM: 8GB DDR3 @ 1333 MHz)

ecution units in a single SM/CU. On the other hand if we or-

and II (CPU: Intel Core i7-5600U @ 2.60GHz, GPU: In-

ganize threads in fewer large blocks, some SM/CU could be

tel HD Graphics 5500, RAM: 12 GB DDR-3 @ 1333MHz).

underutilized while threads in other SM/CU would queue.

Note that thread-block (64) and batch (20) sizes were tuned

for best (on average) performance on hardware configura-

In our case, we can not define a constant thread-block size

tion I and used for both hardware configurations. All mea-

that would be optimal or even near-optimal for all possi-

surements were made using the microbenchmark package [8],

ble values Rtask, B and all architectures. For the exper-

taking the median value of 1000 repetitions.

iments done in Section 4, we fine tuned the thread-block

size together with the batch size B empirically. We reduced

4.1

Generating PG random variates

the search space for these two parameters based on detailed

knowledge of the architectures used.

We measured the running times of generating a single PG

random variate (r = 1) for increasing h = 2, 4, 8, . . . across

different z = {0.1, 0.5, 1.0, 2.0}. The number of samples r is

3.4

Implementation

set to 1. Varying both r and h is redundant, as increasing

We implemented the algorithm as part of a R package. Input

either one increases the key performance variable Rtask.

data checking and output data allocation was implemented

in R, while the rest of the algorithm was implemented in

Results are summarized in Figure 1 and show that for large-

C++ and OpenCL 1.2. OpenCL was used over CUDA for

enough h speedups of over 100 and approximately 30 can

the purpose of being GPU-vendor agnostic. OpenCL 1.2 was

be achieved on configurations I and II, respectively. Results

used over 2.x in order to be compatible with more GPUs.

also show that the value of z does not have a substantial

influence on running times.

4.

EMPIRICAL VALIDATION

We performed two experiments. The first experiment is a di-

For small h, running times of the GPU algorithm can be

rect comparison of our parallel GPU algorithm and existing

up to 10 times slower than the BayesLogit implementation

CPU implementations for sampling from the PG distribu-

and Rtask of over 10000 is required to achieve at least some

tion from the BayesLogit package [9]. The second experi-

speedup. This is due to the overhead of building and trans-

11





expected on high-end specialized GPU. Furthermore, the

Table 1: Estimated running times and speed-ups

measurements in our paper regarding the size of h required

for a single iteration of Gibbs sampling on the

to achieve speedup and maximum speedup are pessimistic,

NBAplayers data set. Results are for each hardware

because they include, for each sampling, the overhead for

configuration (I and II) separately and broken down

transferring data and code to the GPU, which would only

into three parts: sampling from Polya-Gamma, sam-

have to be done once.

pling from multivariate normal, and matrix oper-

ations. Standard deviations are included, because

Additional speedup could be achieved by parallelizing the

times vary from iteration to iteration.

matrix operations and generating multivariate random vari-

Part

BayesLogit [µs]

GPU [µs]

Speedup

ates, for which efficient GPU implementations exist. Com-

PG

230423 ± 4745

5217 ± 120

44 ± 1

bining these into fully GPU-parallelized multinomial and

N

I

k

50 ± 2

50 ± 1

-

count models is delegated to future work. Another direc-

matrix

3212 ± 133

3107 ± 83

-

tion for future work is automatically tuning the two paral-

total

233685 ± 4790

8435 ± 160

27 ± 0

lelization parameters (thread-block size, batch size) to the

PG

127482 ± 2525

6793 ± 112

18 ± 0

architecture of the GPU and Rtask. This can be done either

N

II

k

53 ± 1

54 ± 1

-

offline, learning from performance on known configurations,

matrix

3969 ± 9

3996 ± 42

-

or online, gradually adapting the parameters based on per-

total

131504 ± 2522

10844 ± 101

12 ± 0

formance, which is particularly suited to Gibbs sampling, as

several iterations are made with the same values of hi.

ferring the OpenCL kernel and the transfers of input and

6.

ACKNOWLEDGMENTS

output data. The cost of the former is larger but is also

This work was supported by the Slovenian Research Agency

one-time and thus negligible for large h.

(ARRS Applied Project L1-7542).

4.2

Binomial regression

7.

REFERENCES

We illustrate the benefits of GPU-accelerated draws from

[1] C. M. Bishop. Pattern Recognition and Machine

PG to Bayesian inference. We use the binomial regression

Learning. Springer-Verlag New York, 2006.

model from Eq. (2) to model the relationship between bas-

ketball players’ features and their career three-point shoot-

[2] G. E. P. Box and M. E. Muller. A note on the

ing percentages. The model was implemented in R, using

generation of random normal deviates. Ann. Math.

core R matrix operations and

Statist. , 29(2):610–611, 06 1958.

MASS package for generating

multivariate normal variates [10]. Zero µ

[3] V. Demchik and N. Kolomoyets. QCDGPU:

0 and unit diagonal

Σ

Open-Source Package for Multi-GPU Monte Carlo

0 were used as prior values.

Lattice Simulations. Computer Science, 1(1):13–21,

The data set contains 744 NBA basketball players and, for

2014.

each player, his total three points attempted and made,

[4] L. Devroye. On exact simulation algorithms for some

weight, height, minutes per game, and indicator variables

distributions related to Jacobi theta functions.

whether the player was a center or a forward. Therefore, a

Statistics & Probability Letters, 79(21):2251–2259,

total of 744 samples and 6 input features, including the con-

2009.

stant term (intercept). The average number of three-point

[5] D. Luc. Non-uniform random variate generation. NY:

shots attempted (n

Springer, 1986.

i in the model, hi in sampling) is 322.6

(ranging from 1 to 4270), leading to Rtask = 239988.

[6] M. Manssen, M. Weigel, and A. K. Hartmann.

Random number generators for massively parallel

Results are summarized in Table 1. Between-iteration vari-

simulations on GPU. The European Physical Journal

ability is relatively small, which is consistent with values

Special Topics, 210(1):53–71, 2012.

of z not having a strong influence on running times (the

[7] G. Marsaglia. Xorshift RNGs. Journal of Statistical

hi = ni are fixed for all iterations). The speedup for draws

Software, 8(1):1–6, 2003.

from PG (44 and 18 for configurations I and II, respectively)

[8] O. Mersmann. microbenchmark: Accurate Timing

is also consistent with results from the first experiment for

Functions, 2015. R package version 1.4-2.1.

Rtask = 239988. The example also illustrates how most of

[9] N. G. Polson, J. G. Scott, and J. Windle. Bayesian

the speedup in PG draws translates to total speedup, be-

inference for logistic models using Pólya–Gamma

cause the PG part represents the majority of the total time.

latent variables. Journal of the American statistical

This is true when n and number of input features k are

Association, 108(504):1339–1349, 2013.

small relative to Rtask, which is common in the practice of

[10] W. N. Venables and B. D. Ripley. Modern Applied

binomial, multinomial, and count regressions.

Statistics with S. Springer, New York, fourth edition,

2002. ISBN 0-387-95457-0.

5.

CONCLUSIONS

[11] J. Windle, N. G. Polson, and J. G. Scott. Sampling

In this paper we have parallelized the process of generat-

polya-gamma random variates: alternate and

ing Polya-Gamma random variates on GPU. Speedup were

approximate techniques. arXiv preprint

of the order of 100 on a mid-range non-specialized GPU,

arXiv:1405.0506, 2014.

which already makes it very useful both as a standalone sam-

pler and as part of a Gibbs sampling scheme for Bayesian

multinomial and count models.

Higher speedups can be

12





Monitoring and Management of Physical, Mental and

Environmental Stress at Work

Božidara Cvetković, Martin

Michał Kosiedowski

Mitja Luštek

Gjoreski, Martin Frešer

Poznań Supercomputing and

Jožef Stefan Institute

Jožef Stefan Institute

Networking Center

Department of Intelligent Systems

Department of Intelligent Systems

ul. Jana Pawla II 10, Poznań, Poland Jamova cesta 39, Ljubljana, Slovenia

Jamova cesta 39, Ljubljana, Slovenia

kat@man.poznan.pl

mitja.lustrek@ijs.si

boza.cvetkovic@ijs.si





ABSTRACT

health and comfort problems, and is most commonly linked with

In this paper, we present the design of the Fit4Work system for

air quality. It has been shown that in addition to health related

monitoring

and

management

of

physical,

mental

and

problems, the bad environmental conditions can cause decreased

environmental stress at work. In addition, preliminary results of

productivity. Inappropriate temperature decreases the productivity

the monitoring modules are presented. The goal was to design a

by 10%, humidity by 5% and air pollutants by 6-9%, which can

low-cost system that can use commercial sensors to monitor

result in a higher level of mental stress due to productivity

several aspects of the users’ lifestyle and to provide

pressure [1][2][3]. The objective of the Fit4Work system [4] is to recommendations for improving it. The system is designed for

detect sedentary, stressful and unhealthy environment and provide

older workers who are subject to sedentary stressful work in an

information in the form of recommendation to the user.

office environment. The preliminary results show that the system

The smartphone industry is constantly developing and with each

can sufficiently detect sedentary, stressful and unhealthy

new product the difference between the computing power of a

environment and provide information in the form of

computer and a smartphone is being blurred. Today’s average

recommendation to the user.

smartphone is equipped with sensors for motion monitoring (e.g.,

accelerometers, gyroscopes, GPS, etc.), ambiance monitoring

Categories and Subject Descriptors

(e.g., microphones, light, etc.) and with each next generation this

D.3.3 [Human-centered computing]: Ubiquitous and mobile

list is extended. If we take into account that by the end of 2017,

computing – ambient intelligence, mobile computing, ubiquitous

over a third of the world population is projected to own a

computing

smartphone (2.6 billion users) it makes it the most appropriate

device for mobile sensing and computing. Furthermore, with

increased availability of commercial sensors and even dedicated

Keywords

devices that can connect with a smartphone and sense

Physical activity monitoring, mental stress detection, environment

environmental parameters or physiological parameters of a

quality management, wearable sensors, ambient sensors

person, the possibility of applications has become almost infinite.

Monitoring of persons’ physical activity is not new and is in fact

1. INTRODUCTION

very popular in terms of number of smartphone applications,

The pace of life has been increasing with each day in the

dedicated devices and even smartwatch applications already

developed world, resulting in a lot of health risk factors. Most of

available on the market. Smartphone only applications for activity

the active population spends large amount of time at work in

monitoring use smartphone accelerometer to estimate the burned

closed environments under productivity pressure, which

calories and amount of movement based on the number of steps

contributes to different aspects of stress, such as unhealthy

the user takes over a day. Dedicated devices such as Microsoft

amount of sedentary activity, high mental stress and in many cases

Band 2 [6] use simplified activity recognition and machine-

low quality of the environment. If such lifestyle is maintained

learning to improve the estimated calorie burn. Apple Watch [7]

over longer period of time and even becomes a regular lifestyle of

goes one step further and contains hourly reminders if the person

a person it can severely contribute to development of other

is sedentary and provides feedback and motivation to reach the

behavioral changes and chronical or even deadly diseases.

daily goal. However, if the user wants to be monitored accurately

Physical stress or physical inactivity is the fourth leading cause of

it is required to input the current activity which is being

death worldwide. It contributes to increased mental stress,

performed. The applications and devices available on the market

development of cardiovascular diseases, diabetes, obesity and

are satisfactory for most of the active users who want to track their

other unhealthy behavioral changes. Exposure to mental stress at

activities such as running or just to get an insight into their

the workplace is not necessarily a negative thing since right

lifestyle. If the application/device also provides a feedback and a

amount of stress is needed for better productivity, however,

motivation the price gets too high, thus being too expensive for an

prolonged exposition to stress can result in chronical stress which

average person.

is a trigger for slower body recovery, other mental illnesses and

Monitoring mental stress using commercial and unobtrusive

vulnerability to infections due to decreased immune system.

devices is relatively new and challenging research topic. Healey

Environmental stress is related to overstaying in the environment

and Picard [8] were first to show that the stress can be detected with low quality conditions. The term Sick Building Syndrome

using physiological sensors which required intrusive wires and

(SBS) is used for buildings that cause the occupants various

electrodes. Hovsepian et al. [9] proposed cStress which

13





continuously monitors stress level using an ECG sensor, and were

utilizes data from the commercial weather station NetAtmo to

looking forward to use smartwatches in the future. With the

detect whether the quality of the environment has decreased. Each

advancement of the technological devices equipped with

of the data analysis components has its own recommendation

physiological sensors, such as Empatica [10] and Microsoft Band

component which utilizes results from the data analysis to

2 [6], these obtrusive methods could finally be implemented for

generate recommendation. The generated recommendations are

everyday use. Various studies exist, where researchers combined

sent back to the users’ smartphone.

signal processing and machine-learning to implement stress

detection. The studies were utilizing different sources of signals,

obtrusive sensors such as camera, microphone or unobtrusive

wearables [11] such as wrist device which is acceptable for most

of the people.

The monitoring and control of indoor environment parameters is a

popular topic in smart–building research where they mainly focus

on the trade-off between the occupants’ comfort in terms of

environment quality, and energy consumption [12]. In this

research it is prerequisite to have an automated building and fully

equipped rooms with HVAC systems which can be a substantial



investment and mainly indicates that most of the people living in

Figure 1. Fit4Work system for monitoring and management of

older buildings will not be able to use such system. To develop a

physical, mental and environmental stress overview.

system which does not need very expensive equipment one must

develop several virtual sensors for detection/estimation of



parameters that cannot be sensed directly and develop prediction

2.1 Physical Activities

models for environmental parameters. A correlation between the

The goal of the physical activity module is to recognize current

environmental values and the occupancy state or window state has

state of the user is terms of activity and expended energy.

been reported in previous research [13] which proves that it can Additionally, the output of the physical activity module is

be modeled. Estimation of the number of occupants was

aggregated and used for generating recommendations which will

successfully achieved with using the data of single or multiple

guide the user to successfully achieve daily and weekly activity

environmental parameters [14]. We did not come across any

goals.

relevant methods for window state detection. The prediction of

the dynamics of the parameters is usually done with the predictive

control technique [15] which requires the user to describe the

environment in detail (e.g., material of walls, windows, etc.)

which is not very user friendly.

The Fit4Work system will utilize sensors from an accelerometer

equipped smartphone, Microsoft Band 2 and a commercial

weather station NetAtmo [16] to monitor and detect increased

physical, mental and environmental stress at workplace and

provide recommendations through the dedicated smartphone

application which will assist in management of the stresses. To be

able to successfully achieve this goal we designed a system which

utilizes multiple machine-learning models to monitor physical

activity, mental stress of the user and predict the state of the



workplace and predict future dynamics of the environmental

Figure 2. Workflow of the physical activity monitoring data

parameters.

analysis component

The paper presents the design of the final Fit4Work system for

The physical activities method is composed of six tasks and is

monitoring

and

management

of

physical,

mental

and

presented in Figure 2. The devices we use are the accelerometer environmental stress, and refers to preliminary results of the

equipped smartphone and the Microsoft Band 2 wristband capable

modules which were obtained independently of other modules.

of measuring acceleration and physiological signals such as heart

rate, galvanic skin response, skin temperature and R-R interval.

2. FIT4WORK SYSTEM OVERWIEW

First, we check whether each of the two devices is present on the

The Fit4Work system for monitoring and management of

user’s body. Second, if the smartphone or wristband is on the

physical, mental and environmental stress is composed of three

body, we wait for a walking period. Walking is detected with a

data analysis and recommendation components and is presented in

machine-learning model in a location- and orientation-

Figure 1. The physical activity monitoring utilizes data from the independent manner. The walking signal is a prerequisite for

smartphone accelerometer and Microsoft Band 2 to recognize the

normalizing the orientation of the smartphone and the wristband

current activity of the user and to estimate the energy expenditure

and for recognizing the location of the smartphone. Third, when a

while performing the recognized activity. The mental stress

10-second period of walking is detected, we normalize the

monitoring utilizes data from the Microsoft Band 2 and physical

orientation and detect the location of the smartphone utilising

activity as recognized by the physical activities module to detect

location machine-learning model. If the smartphone ceases to be

the level of mental stress. The quality of the environment module

on the body because the user has taken it out of the pocket or bag,

the information on the orientation and location is no longer valid

14





and we have to wait until the next period of walking. Fourth,

distinguish between true stress and the many situations which

depending on the devices which are on the body and the location

induce a similar physiological arousal (e.g., exercise, eating, hot

in which the smartphone is, we invoke the appropriate

weather, etc.). By introducing the context-based classifier we can

classification model for activity recognition (AR) and regression

provide more information about the real-life circumstances and

model for estimating user's energy expenditure (EEE). For more

the user, thus to improve the detection performance.

information the reader is referred to [17].

The preliminary experiment of the module was done using the

In this paper we present preliminary results for AR (lying,

Empatica wristband and the results are presented in Table 1. For

walking, running, standing, sitting, cycling, mixed) and EEE with

more details the reader is referred to [21].

the wristband only if the wristband is present and smartphone

Table 2. Confusion matrix for leave one user out evaluation of

only if wristband is not present. The results are presented in Table

the mental stress detection module.

1. The experiments were performed on the data obtained with

Empatica in scenario presented in [17]. The results of AR is

No Stress

Stress

presented in terms of accuracy and the EEE in terms of mean

No Stress

790

23

absolute error (MAE). The results of the EEE are compared

Stress

51

63

against the commercial device SenseWear [19], one of the most

accurate consumer devices for EEE.

Future work includes investigating the relation between labelled

stress level, recognized stress level and cortisol levels and

Table 1. Preliminary results of the activity recognition and

personalization of the models.

estimation of energy expenditure.

2.3 Quality of the Environment

Smartphone





Monitoring of the quality of the environment detects worsening of

Wristband Trousers Torso Bags SenseWear

measured parameters and recommends appropriate actions which

AR [%]

88.4

87.9

78.9 79.5

/

will return to or keep the parameters in optimal range. The module

EEE [MET]

0.87

0.87

0.92 1.15

1.0

for monitoring and management of the quality of the environment

Future work includes use of both devices if both are present as

is composed of three components - a sensing component, an

done in our previous work [17].

ontology with a reasoner, and a simulator, as presented in Figure

4.

2.2 Mental Stress

The sensing component is composed of hardware sensors and

The stress monitoring module continuously monitors user’s stress

virtual sensors. The hardware sensors are the real sensors which

and provides overview of stressful events to the user.

measure the environmental parameters such as temperature,

Additionally, the output of the stress-monitoring module is used

humidity, CO2, noise, etc., while virtual sensors use machine –

by a stress-relief module for suggesting appropriate exercises at

learning models on the raw parameter values to estimate room-

appropriate time.

specific properties that cannot be sensed directly such as

For the stress-monitoring module we developed a machine

estimating the number of occupants and detecting the state of the

learning method which is applied on data collected via sensor-

devices. The outputs of the sensing component are fed into the

equipped device worn on the wrist, Microsoft Band 2. The

ontology, from which the reasoner infers actions that can improve

method for detection of mental stress is presented in Figure 3 and

the state of the environment, based on the current state and

consists of three machine-learning components (a base stress

present devices. The list of actions returned by the ontology is fed

detector, an activity recognizer and a context-base stress detector).

into the simulator. The simulator is composed of prediction

models and the quality rating module (Q–rating) which predicts



the environmental values for all combination of actions and

evaluates the resulting parameter states (values range between -1

and 1). The action resulting in the best state is finally

recommended by the system. The reader is referred to [22] for

more details.





Figure 3. Method for mental stress detection.

The laboratory stress detector is a machine-learning classifier

trained on laboratory data to distinguish stressful vs. non-stressful

events in 4-minute data windows with a 2-minute overlap. As

input it uses features computed from the physiological signals

(blood volume pulse, heart rate, R-R intervals, skin temperature



and electrodermal activity), and it outputs a prediction for

Figure 4. Method for monitoring and management of the

possible stress event. In addition, the activity is retrieved from the

quality of the environment.

physical activity monitoring module and provides a context

The preliminary results on the winter data are shown in Table 3.

information for the context-based stress detector. The context-

First two rows present the results of the virtual sensors, one for

based stress detector aggregates the predictions of the laboratory

detecting the window state and the second for estimating number

stress detector, uses context information and provides a prediction

of occupants, last three rows present the results of the prediction

every 20 minutes. The interval of 20 minutes was chosen

models. The result for the classification is presented in terms of

empirically. We decided to use the context-based classifier to

15

accuracy and results for the regression in terms of mean absolute

[5] Smartphone statistics,

error (MAE) and root mean squared error (RMSE).

https://www.statista.com/topics/840/smartphones/

The future work includes development of the virtual sensors for

[6] Microsoft Band 2, https://www.microsoft.com/

other devices, such as humidifier, air conditioning etc. and

[7] Apple Watch, http://www.apple.com/watch/

improvement of the prediction models by using additionally

collected data.

[8] Healey J. A., Picard R. W. 2005. Detecting Stress During

Real-World Driving Tasks Using Physiological Sensors.

Table 3. Results of the machine-learning models used for

IEEE Trans. Intell. Transp. Syst., vol. 6, no. 2, pp. 156–166.

virtual sensors (window state and number of occupants) and

for prediction of environmental parameters.

[9] Hovsepian, K., Absi, M., Kamarck, T., Nakajima, M. 2015.

cStress : Towards a Gold Standard for Continuous Stress



ACC

MAE

RMSE

Assessment in the Mobile Environment. In the Proceedings

E

V

E

V

E

V

of the 2015 ACM UbiComp, pp. 493-504, 2015.

Window state [%]

91

81

✗

✗

✗

✗

[10] Empatica, https://www.empatica.com/

No. of occupants

✗

✗

0.6

0.5

1.2

0.8

Predict T [℃]

✗

✗

0.4

0.5

0.5

0.6

[11] Muaremi, A., Bexheti, A., Gravenhorst, F., Arnrich, B.,

Predict H [%]

✗

✗

0.6

0.3

0.9

0.5

Tröster, G. 2014. Monitoring the Impact of Stress on the

Predict CO

Sleep Patterns of Pilgrims using Wearable Sensors. IEEE-

2 [ppm]

✗

✗

55

45

104

61

EMBS Int. Conf. Biomed. Heal. Informatics, pp. 3–6.

3. CONCLUSION

[12] Kumar, P., Martani, C., Morawska, L., Norford, L.,

We present a work-in-progress system for monitoring and

Choudhary, R., Bell, M., Leach, M. 2016. Indoor air quality

management of physical, mental and environmental stress which

and energy management through real-time sensing in

utilizes machine-learning models on the data from commercial

commercial buildings. Energy and Buildings 111, 145–153.

devices (accelerometer equipped smartphone, Microsoft Band 2

[13] Konuah, A. B. 2014. Occupant window opening behaviour:

and NetAtmo weather station) to detect the state of the user and

the relative importance of temperature and carbon dioxide in

recommend appropriate actions to improve users physical, mental

university office buildings. Ph.D. Dissertation. University of

and environmental state.

Sheffield.

The goal of the Fit4Work project is to develop a low cost system

[14] Candanedo, L. M., Feldheim, V. 2016. Accurate occupancy

which can be used in any environment (without the need to

detection of an office room from light, temperature, humidity

renovate or automate the environment). The preliminary results of

and CO 2 measurements using statistical learning models.

each module show that selected devices in combination with

Energy and Buildings 112, 28–39.

intelligent algorithms can sufficiently detect sedentary, stressful

[15] Kolokotsa, D., Pouliezos, A., Stavrakakis, G., Lazos. C.

and unhealthy environment and provide information in the form

2009. Predictive control techniques for energy and indoor

of recommendations to the user.

environmental quality management in buildings. Building

In the future, we plan to integrate the three modules into a single

and Environment 44, 9, 1850–1863.

system in form of a smartphone application and develop first

[16] NetAtmo, https://www.netatmo.com/

recommendation prototypes. We will upgrade the machine-

[17] Cvetković, B., Milić R., and Luštrek, M. 2016. Estimating

learning models for recognition of the activities and estimation of

Energy Expenditure With Multiple Models Using Different

energy expenditure to utilize both devices (the smartphone and

Wearable Sensors, in IEEE Journal of Biomedical and Health

the wristband) as seen in our previous work, personalize the stress

Informatics, vol. 20, no. 4, pp. 1081-1087.

detection method and upgrade its machine-learning models to

utilize other contextual information and enhance the environment

[18] Cvetković, B., Janko, V., Luštrek, M. 2015. Demo abstract:

quality monitoring with additional machine-learning models.

Activity recognition and human energy expenditure

estimation with a smartphone, PerCom 2015, pp. 193-195.

4. ACKNOWLEDGMENTS

[19] SenseWear, https://templehealthcare.wordpress.com/the-

This research was conducted within the Fit4work project co-

sensewear-armband/

financed through the AAL Programme.

[20] Palshikar, G.K. 2009. Simple Algorithms for Peak Detection

in Time-Series”. Ii the Proc. 1st Int. Conf. Advanced Data

5. REFERENCES

Analysis, Business Analytics and Intelligence, June.

[1] Kohl, H. W., Craig, C. L., Lambert, E. V., Inoue, S.,

[21] Gjoreski, M., Gjoreski, H., Luštrek, M., Gams. M. 2016.

Alkandari, J. R., Leetongin, G., Kahlmeier, S. 2012. The

Continuous stress detection using a wrist device: in

pandemic of physical inactivity: global action for public

laboratory and real life. In Proceedings of the 2016 ACM

health. The Lancet, vol. 380, no. 9838, pp. 294–305, 2012.

UbiComp '16

[2] WHO Mental Stress,

[22] Frešer, M., Cvetković, B., Gradišek, A., and Luštrek, M.

http://www.who.int/occupational_health/topics/stressatwp/

2016. Anticipatory system for T--H--C dynamics in room

with real and virtual sensors. In Proceedings of the 2016

[3] Lan, L., Wargocki, P., Lian. Z. 2012. Optimal thermal

ACM UbiComp '16.

environment improves performance of office work. REHVA

European HVAC Journal (January 2012), 12–17



[4] Fit4Work, http://www.fit4work-aal.eu/

16





Bayesian binary and ordinal regression with structured

uncertainty in the inputs

Aleksandar Dimitriev

Erik Štrumbelj

University of Ljubljana

University of Ljubljana

Faculty of Computer and Information Science

Faculty of Computer and Information Science

Večna pot 113, Ljubljana, Slovenia

Večna pot 113, Ljubljana, Slovenia

ad7414@student.uni-lj.si

erik.strumbelj@fri.uni-lj.si

ABSTRACT

that forecasting accuracy can be improved by first modelling

We propose a novel approach to binary and ordinal predic-

the uncertainty in the inputs and second taking this uncer-

tion with structured uncertainty in the input variables. It

tainty into account in the model. In this paper we focus

is based on efficiently approximating the prediction model

on developing an approach for the the second task, while

conditional on the inputs and then marginalizing the con-

addressing the first in a limited case.

ditional model over the input space using Monte Carlo ap-

proximation. For efficiency, the well-known Laplace approx-

In its essence, the problem is similar to modelling with mea-

imation is used for the binary case and we derive a similar

surement error, where input variables X (in our case, aggre-

approximation for the ordinal case. Empirical evaluation on

gated counts) are measured with error. This has received

sports data shows that the proposed approach substantially

a lot of attention [2, 3]. Measurement error models can

improves forecasting accuracy and highlights the severity of

roughly be split into two groups: functional or structural

the problem of uncertainty in the input variables in sports.

modelling [2]. Functional models make no or very few as-

sumptions about the distribution of X, while in structured

Keywords

models, a model is placed on X, typically a parametric one.

Laplace approximation, sampling, sports, inference

The difference between our case and a typical measurement

error settings is that sports teams recur, so training samples

1.

INTRODUCTION

are not independent or identically distributed, but we have

This work is motivated by a problem from sports forecasting

multiple measurements of the same input variable. This al-

[9] - predicting future outcomes from past performances. To

lows us to use a structural approach as described in section

illustrate the problem, imagine we are interested in predict-

3.1. However, we also need a model that can handle uncer-

ing the outcome of a basketball game between two teams.

tain inputs and this is the focus of the paper. In Section 2,

Typically, we would first compile a set of past games with

we propose a Bayesian approach to model the relationship

outcomes and relevant performance-related variables, which

between the noisy inputs and the outputs and incorporate

are typically count variables, such as shots made/missed, re-

uncertainty associated with the inputs in the form of prior

bounds, etc... Then we aggregate these variables across past

information. The empirical evaluation, data, and results are

performances and possibly transform them into variables

described in Section 3. With Section 4 we conclude the pa-

that are known to be good descriptors of team quality and

per and offer directions for further work.

predictors of future performance, for example, shooting per-

centage (total shots made divided by total shots attempted).

2.

METHODS

Finally, we would use some binary regression model to model

The standard prediction setting is the following: we are

the relationship between the inputs and the outcomes.

given a dataset X ∈ Rn×m, which consists of n observa-

tions and m input variables, and a target variable y ∈ Rn.

What has so far not been taken into account in related work

We then train a model that can produce predictions y∗ for

is that past performance variables are noisy and that the

a given new x∗ by using the training data X. We adopt

noise is transferred to any input variables we derive from

the Bayesian approach p(β|X, y) ∝ p(X, y|β)p(β), where β

them. Fitting a model and failing to account for this uncer-

are the parameters of the model. For the prediction setting,

tainty will result in being overconfident in the parameters of

we obtain the probability by marginalizing over the model

the model and subsequently the forecasts. We hypothesize

parameters:

Z

p(y∗|x∗, X, y) =

p(y∗|β, x∗)p(β|X, y) dβ.

In the above, the data are treated as constant. A more gen-

eral problem arises in our setting, where we instead treat the

inputs X and x∗ as random variables, with densities p(X|w)

and p(x∗|w∗), where w and w∗ are known constants. In this

case, to train a model and obtain its posterior distribution

p(β|w, y) we must marginalize over the input space of pos-

17





sible training data sets X:

for j ∈ {1, .., k − 1}, where β and αj are parameters. For

Z

Z

convenience, let α0 = −∞ and αk = +∞. The outcome

p(β|w, y) =

p(β, X|w, y) dX =

p(β|X, y)p(X|w) dX.

probabilities can then be written as

Similarly, to obtain a prediction for y∗ we must not only

P (Y = j|x) = P (Y ≤ j|x) − P (Y ≤ j −1|x)

marginalize over the parameters β, but also over the distri-

= σ(βx+αj)−σ(βx+αj−1), for j ∈ {1, .., k},

bution of the test sample x∗ as follows:

where σ is again the inverse logit function.

Z

Z

p(y∗|w,w∗,y) =

p(y∗|x∗, β)p(x∗|w∗)p(β|w, y) dx∗ dβ

x

The proportional odds model and its generalizations (see [7])

∗

β

(1)

have received very little attention in the Bayesian setting.

This approach is general, since it can be applied with any

We now derive the Laplace approximation to the posterior

model that produces a posterior probability distribution over

of this model. First, we place priors on the parameters.

its parameters p(β|X) and a distribution over its predictions

To ensure in-order intercepts, we introduce parameters dj,

p(y∗|x∗, X) for a given test sample x∗. What remains for the

j ∈ {1, .., k −1} and a stick-breaking re-parametrization of

model to be fully specified is defining the distributions that

the k−1 parameters aj with aj = Pj

= d

i=1

i. We place flat

generate the training and test data sets. The integral in

priors on the parameters p(di) ∝ 1 and all di are restricted

Eq. (1) will generally be intractable even for the simplest of

to be positive, except for d1. As in the binary case, we place

Bayesian models and is typically approximated using Monte

normal priors on the coefficients β1, ..., βm ∼ N (0, σβ). The

Carlo approximation E[y∗|w∗, w, y] ≈ 1 PN y∗ , where

model’s likelihood is

N

i=1

(i)

y∗ is a random sample from the posterior predictive distri-

(i)

n

k

m

(βi)2

bution in Eq. (1) and can be obtained by sequentially sam-

Y Y

Y

p(β, d|D) ∝

(R

e 2σ2β ,

pling X

i,j − Ri,j−1 )(yi=j)

(i) from p(X |w), β(i) from p(β|X = X(i) , y), x∗

(i) from

i=1 j=1

i=1

p(x∗|w∗), and finally, y∗(i) from p(y∗|x∗ = x∗(i), β = β(i)).

where Ri,j = σ(βxi + αj), and the log-likelihood

The densities p(X|w), p(x∗|w∗) represent our structural mea-

n

k

surement error model and are in most practical cases easy

1

X X

L = C −

(β ◦ β)+

I(y

to sample from efficiently. The densities p(β|X = X

i = j) log(Ri,j −Ri,j−1),

(i), y)

2σ2β

i=1 j=1

and p(y∗|x∗ = x∗(i), β = β(i)) are the posterior and posterior

predictive for the selected prediction model, conditional on

where I is the indicator function and C is a constant. The

the inputs being fixed, and we have to be able to efficiently

gradient of the log-likelihood cannot be written as succinctly

sample from them. This implies that we need an efficient

as is the case with logistic regression. Instead, we start with

prediction model or, in the case of Bayesian models, which

the derivative for some parameter θ:

are typically computationally intensive, an efficient struc-

n

k

tural approximation to p(β|X = X

∂

1

∂

∂

(i), y).

X X

L = −

(β ◦ β) +

L

∂θ

2σ2 ∂θ

∂θ i,j ,

β

i=1 j=1

2.1

Approximate logistic regression

Bayesian logistic regression is a discriminative model that

where ∂ L

∂θ

i,j = 0 if yi 6= j and

for a given example (x, y), y ∈ {0, 1}n, yields a probabilistic

prediction: P (y = 1|x) = σ(βT x), where σ(x) = 1/(1 + e−x)

∂

1

∂

∂

Li,j =

(

Ri,j −

Ri,j−1)

is the logistic function and β is the parameter vector of the

∂θ

Ri,j − Ri,j

∂θ

∂θ

−1

model. The typically-used prior is β ∼ N (µ



0, Σ0 ).

1

∂

=

R

(βx

R

i,j (1 − Ri,j )

i + αj )

i,j − Ri,j

∂θ

−1

A closed-form expression does not exist for neither its poste-



rior nor its predictive distribution. For efficiency, we use the

∂

− Ri,j

(βx

,

−1 (1 − Ri,j−1 )

i + αj−1)

well-known Laplace approximation (see [1], pages 213-215).

∂θ

The approximation to the posterior is the following Gaus-

otherwise. The derivation of the Hessian is omitted. It had

sian p(β|X, y) ∼ N (qmap, (Σ−1 + XT RX)−1), where R is a

0

little effect on the accuracy and running times - all results

diagonal matrix with entries rii = σ(qT

mapxi)(1 − σ(qT

mapxi))

reported in the empirical evaluation were obtained using a

and qmap is the posterior maximum, which can be obtained

numerical approximation to the Hessian.

with gradient methods.

3.

EMPIRICAL EVALUATION

2.2

Approximate proportional-odds model

We empirically evaluated our approach on (a) forecasting

We now derive a Laplace approximation to the proportional-

basketball game outcomes, which always have a winner and

odds model (ordinal logistic regression), which is the most

can be treated as a binary regression task, and (b) football

commonly used model for the ordinal setting [6]. Let n and

match outcomes, where draws are also possible, making it

m again be the be the number of samples, and input vari-

an ordinal regression task.

ables, respectively and k the number of (ordered) categories.

The model is based on the assumption that the odds of all

In both cases, the count data are first preprocessed to obtain

binary decisions between categories are proportional to each

uncertainty in the input variables, which is described in Sec-

other or, equivalently, that the k−1 logit surfaces are parallel:

tion 3.1. As a baseline for comparison, binary logistic regres-

P (Y ≤ j|x)

sion and ordered logistic regression are included, using mean

logit(P (Y ≤ j|x)) = log

= βx + α

P (Y > j|x)

j ,

counts. We also include the baseline with noisy test cases.

18





0.000

0.00

-0.002

-0.01

-0.004

-0.02

mean relative squared error (+/- SE)

mean relative rank probability score (+/- SE)

logistic reg.

logistic reg. + noise proposed (HMC) proposed (Laplace)

-0.03

model

ENG

FRA

GER

ITA

SPA

Figure 1:

Estimated prediction errors across all

competition

train-test season pairs in the NBA basketball data

ordinal reg.

ordinal reg. + noise

proposed (Laplace)

set. All errors are relative to the baseline for com-

parison (logistic regression).

Figure 2: Estimated prediction errors for each foot-

This is obtained by treating the test input variables as ran-

ball league and across all train-test season pairs in

dom, using the same structural model for the inputs as for

that league. All errors are relative to the baseline

the proposed model, and approximating the expected pre-

for comparison (ordial regression)

diction of the baseline models with Monte Carlo sampling.

For the binary case we also include the proposed model with-

out marginalization - jointly modelling β and X, including

A ∼ Gamma(α, θ) and B ∼ Gamma(β, θ), then X =

A

A+B

the uncertainty in the form of an informative prior on X.

is distributed X ∼ Beta(α, β).. Therefore, for a ratio vari-

We implement this model in the probabilistic programming

able derived from Poisson posterior rates of the form R =

P

language and tool for Bayesian inference Stan [8].

i λA,i

P

P

is Beta distributed R|¯

λ

λ

A,i, ¯

λB,i ∼ Beta(P ¯

λA,i, P ¯

λB,i).

A,i +

λB,i

The scale parameters for the θ are always the same, since the

Our empirical evaluation procedure is a straightforward mea-

second parameter in the Gamma distribution corresponds to

surement of out-of-sample forecasting accuracy, while re-

the number of games played, and it is the same for all counts

specting the time line. We use train-test season pairs, train-

at a particular point in time.

ing on one season and forecasting on the next. Only data

available prior to a match are used in forecasting it. We mea-

sure forecasting accuracy with mean squared error (MSE) in

3.2

Basketball data

the binary case and the rank probability score (RPS) in the

The basketball data used in our experiments consist of all

ordinal case [4].

the regular season and play-off games in the past 13 seasons

(12 train-test season pairs) of the National Basketball Asso-

3.1

Preprocessing count data

ciation (NBA) from 2001/02 to 2013/14. The data were ob-

tained from http://www.basketball-reference.com/. The

Input variables that are used as predictors in sports are typ-

count variables included in the data are counts of two-point

ically count variables or ratios of count variables, in particu-

shots made and missed, three-point shots made and missed,

lar ratios of the form

A

where A and B are sums of count

A+B

turnovers, offensive rebounds and defensive rebounds. We

variables. We will assume that the count variables follow in-

use these counts indirectly by transforming them into 8 ra-

dependent Poisson distributions and are time-homogeneous.

tios, described in [10], which are known to be good predictors

While a more sophisticated model for measurement error

of basketball match outcomes. Note that these ratios are

could be used, we focus on illustrating the benefits of taking

based on the well-known four factors effective field goal per-

into account the uncertainty in the inputs.

centage, rebounding percentage, turnover percentage, and

free throw percentage [5].

A natural choice for the prior distribution of the rate param-

eter λ is the Gamma distribution λ ∼ Gamma(a0, b0), which

is conjugate. Therefore, for each count variable with mean

3.3

Football data

¯

λi over ni games, the posterior is again Gamma λi| ¯

λi, ni ∼

Our football data set consists of 5 complete seasons (2010/11

Gamma(a0 + λini, b + ni), where we select weakly informa-

- 2014/15) of each of the top 5 European national club com-

tive priors a0 = b0 = 0.001. A sum of Gamma distributed

petitions: English Premier League, French Ligue 1, Ger-

random variables with the same scale is again Gamma dis-

man Bundesliga, Italian Serie A, and Spanish La Primera.

tributed with same scale.

Furthermore, if A and B are

That is, 4 train-test season pairs for each of the 5 leagues

Gamma distributed random variables with the same scale

for a total of 20. In addition to the outcome, we include,

19

ENG

FRA

GER

ITA

SPA

0.28

0.24

0.20

0.16

rank probability score (smoothed) 0.12

0

100

200

300

0

100

200

300

0

100

200

300

0

100

200

300

0

100

200

300

season progression (number of matches played)

ordinal reg.

ordinal reg. + noise

proposed (Laplace)

Figure 3: Prediction errors over the course of a season, averaged across all seasons and for each football competition separately.

for each match and each of the two teams that partici-

certainty is highest. This makes the proposed approach a

pated in the match, the number of goals scored, shots and

valuable contribution to the sports forecasting toolbox, but

shots on target, corners, fouls committed, and yellow and

it can also be applied in other similar domains. In this pa-

red cards received. The data were obtained from http:

per we focused on prediction with structured uncertainty in

//football-data.co.uk/data.php.

the inputs and put less emphasis on the estimation of the

uncertainty. Future work could entail developing a model

3.4

Basketball results

that allows the Poisson rates to vary over time and takes

The structural approximation variant of the proposed model

into account the possible correlations between them.

outperforms all other models (see Figure 1). These differ-

ences are not only statistically discernible, but also practi-

5.

REFERENCES

cally relevant (see [10] and references therein). Although

[1] C. M. Bishop. Pattern recognition and machine

the HMC-based approximation yields relatively good pre-

learning. springer, 2006.

dictions, it is discernibly worse than the structural approx-

[2] R. J. Carroll, D. Ruppert, L. A. Stefanski, and C. M.

imation. This can, at least in part, be explained by the

Crainiceanu. Measurement error in nonlinear models:

inferior accuracy of the HMC-based approximation due to

a modern perspective. CRC press, 2006.

slow mixing (effective sample sizes for β were, on average,

[3] W. A. Fuller. Measurement error models, volume 305.

∼ 20% of the total number of iterations and at 1000 itera-

John Wiley & Sons, 2009.

tions, the estimated MCMC sampling error was, on average,

[4] T. Gneiting and A. E. Raftery. Strictly proper scoring

approximately 10% of posterior standard deviation).

rules, prediction, and estimation. Journal of the

American Statistical Association, 102(477):359–378,

3.5

Football results

2007.

Football results are similar to basketball results, however,

[5] J. Kubatko, D. Oliver, K. Pelton, and D. T.

the proposed model outperforms the other models even more

Rosenbaum. A starting point for analyzing basketball

convincingly across all football competitions (see Figure 2).

statistics. Journal of Quantitative Analysis in Sports,

Compared to NBA basketball, football seasons are much

3(3), 2007.

shorter (in terms of matches per team) and there is more

[6] P. McCullagh. Regression models for ordinal data.

uncertainty in the inputs derived from match statistics. As

Journal of the royal statistical society. Series B

anticipated, the proposed model excels at the beginning of

(Methodological), pages 109–142, 1980.

each season and the differences between models’ prediction

[7] B. Peterson and F. E. Harrell Jr. Partial proportional

errors decrease as the season progresses and input variables

odds models for ordinal response variables. Applied

become more certain (see Figure 3). Similar results were

statistics, pages 205–217, 1990.

observed for basketball, but are omitted for brevity. Note

[8] Stan developement team. Stan Modeling Language

that the HMC variant of the proposed ordinal model was

Users Guide and Reference Manual, Version 2.8.0,

not included in the comparison on football data, because

2015.

the computation times make it infeasible for practical use.

[9] H. O. Stekler, D. Sendor, and R. Verlander. Issues in

sports forecasting. International Journal of

4.

CONCLUSION

Forecasting, 26(3):606–621, 2010.

Our hypothesis that ignoring the uncertainty in the input

[10] E. Štrumbelj and P. Vračar. Simulating a basketball

variables in sports will lead to less accurate predictions was

match with a homogeneous Markov model and

correct. The proposed model substantially outperformed

forecasting the outcome. International Journal of

typically-used models.

As expected, the improvement is

Forecasting, 28(2):532–542, 2012.

most substantial at the beginning of the season where un-

20





Multiobjective discovery of driving strategies using the

SCANeR Studio

Erik Dovgan

Artificial Intelligence Laboratory

Department of Intelligent Systems

Faculty of Computer and Information Science

Jožef Stefan Institute

University of Ljubljana

Jamova cesta 39, SI-1000 Ljubljana, Slovenia

Večna pot 113, SI-1000 Ljubljana, Slovenia

erik.dovgan@fri.uni-lj.si

ABSTRACT

the pollution reduction is proportional to the reduction of

This paper presents the SCANeR Studio simulation envi-

the fuel consumption. An interesting side effect of this is a

ronment, which is used to design future vehicles by several

reduction in traveling cost, although this is countered by an

worldwide companies. In addition, this environment can be

increase in traveling time. It is clear that heavily reducing

used to evaluate autonomous vehicle driving algorithms, al-

fuel consumption is not optimal because it leads to unaccept-

though it has several limitations. This paper describes the

ably long traveling time. Therefore, several objectives need

main environment limitations that are relevant for the eval-

to be taken into account simultaneously when constructing

uation of the Multiobjective optimization algorithm for dis-

a driving strategy.

covering driving strategies (MODS). Finally, a set of actions

for mitigating the SCANeR shortages is given.

To obtain autonomous vehicle driving, real-time driving stra-

tegies have to be applied by the autonomous vehicle driving

Categories and Subject Descriptors

algorithms. These strategies have to select the best control

action at each step with respect to the current vehicle and

J.7 [Computer Applications]: Computers in other sys-

route state, e.g., position on the route, position of neighbor

tems

vehicles, velocity of neighbor vehicles etc. The best control

action can be selected by implementing an algorithm that

Keywords

optimizes various objectives, such as the traveling time, the

driving strategies, multiobjective optimization, SCANeR Stu-

fuel consumption, the distance to other vehicles for colli-

dio

sion avoidance etc. Since several objectives have to be si-

multaneously taken into account, it is preferable to use the

multiobjective approach when developing the optimization

1.

INTRODUCTION

algorithm, since it is in general the case that multiobjec-

Autonomous vehicle driving is recently being investigated

tive algorithms exhibit advantages over single-objective al-

by many automotive and other companies, e.g., Ford [6],

gorithms (since they enable, e.g., better exploration of the

Mercedes-Benz [7], Toyota [12], BMW [5], Audi [8], and

multiobjective search space).

Google [13]. Google’s project of autonomous vehicles is led

by Sebastian Thrun, former director of the Stanford Ar-

To use the autonomous vehicle driving algorithms in prac-

tificial Intelligence Laboratory and co-inventor of Google

tice, the proper evaluation of these algorithms is of key im-

Street View, who wan the DARPA autonomous car com-

portance. Since untested algorithms cannot be simply de-

petition. Several driver assistance systems are already in-

ployed in autonomous vehicles and tested in real environ-

stalled in modern vehicles, such as line assist (see, e.g., Volk-

ment, e.g., city roads, the evaluation has to be performed in

swagen [15], Audi [1], and Toyota [14]). In addition, fully

the near-real-life driving simulation environment. Typically,

autonomous vehicles start driving in urban environments.

the best existing driving simulation environments are used

For example, 100 self-driving Volvo cars will drive on public

by the worldwide automotive companies. This paper exam-

roads around the city of Gothenburg by 2017 [16]. Till June

ines the SCANeR Studio simulation environment [9], which

2015, the Google autonomous vehicles have driven over 1

is used by several worldwide companies such as Renault,

million miles encountering 200,000 stop signs, 600,000 traf-

PSA Peugeot Citroën and VOLVO Trucks [10].

fic lights, and 180 million other vehicles.

The goal of the ongoing research is to enhance the existing

The development of an autonomous driving solution usually

Multiobjective optimization algorithm for discovering driv-

focuses on the determination of the vehicle surroundings,

ing strategies (MODS) [2, 3, 4] to take into account real-life

e.g., other vehicles, obstacles, and pedestrians, in order to

driving situations and to evaluate it with the SCANeR Stu-

increase safety and avoid collisions. However, such a strat-

dio. To this end, MODS will have to be significantly adapted

egy may miss to achieve other objectives that are also impor-

due to limitations of the SCANeR Studio.

tant. An objective that has to be taken into account is the

reduction of the environmental pollution. This objective is

The paper is further organized as follows. Section 2 describes

especially important since the awareness of the need to pro-

the existing Multiobjective optimization algorithm for dis-

tect our environment is increasing. When driving a vehicle,

21

covering driving strategies. The SCANeR Studion and its

various entities in the environment. To this end, it consists

limitations are presented in Section 3. Section 4 presents the

of several modules such as:

ongoing and future enhancements of the MODS algorithm.

Finally, Section 5 concludes the paper with the summary of

•

work.

Complex dynamic vehicle models simulating the be-

havior of every component of a real vehicle

2.

MULTIOBJECTIVE OPTIMIZATION AL-

• Simple models of autonomous traffic vehicles simulat-

GORITHM FOR DISCOVERING DRIV-

ing the behavior of other vehicles on the road

ING STRATEGIES

• Models for simulating the behavior of pedestrians

Multiobjective optimization algorithm for discovering driv-

ing strategies (MODS) [2, 3, 4] is a two-level multiobjec-

tive optimization algorithm that finds a set of nondominated

The behavior of these models can be predefined, controlled

driving strategies with respect to two conflicting objectives:

online with the scripting language or managed through APIs

traveling time and fuel consumption. The lower-level algo-

using custom algorithms. In addition, complex vehicle mod-

rithm is based on a deterministic breadth-first search, driv-

els can be controlled by the driver through the acquisition

ing prediction and nondominated sorting, and searches for

modules that can be connected to keyboards, gaming steer-

nondominated driving strategies. The search is performed

ing wheels etc. Vehicle driving is presented to the driver

by simulating vehicle driving by route steps. It starts with a

using visual and sound modules, and the module that con-

single driving strategy with no control actions. If for a route

trols dynamic platforms (if present). Moreover, the models

step the driving strategy does not define the control action,

involved in the simulation can be controlled/executed with

the driving strategy is cloned for each possible control ac-

either SCANeR user interface or programs that control the

tion and the obtained driving strategies are stored in the set

simulation, i.e., the supervisors.

of driving strategies. When the control action is defined, it

is used to predict the vehicle driving for a predefined num-

The SCANeR Studio enables to create a complete driving

ber of prediction steps. Due to the cloning, the number of

simulation environment through the usage of five dedicated

driving strategies grows exponentially. To maintain a con-

modes of the graphic interface:

stant number of driving strategies at each route step, fast

nondominated sorting is used to select the most promising

• Terrain mode: Road network creator allowing the rapid

driving strategies.

creation of realistic road networks that are useable di-

rectly in the simulation

The upper-level algorithm is an evolutionary algorithm that

optimizes the input parameters for the lower-level algorithm.

• Vehicle mode: Tool for fine-tuning and study of dy-

This algorithm applies evolutionary mechanisms, i.e., selec-

namic vehicle models

tion, crossover and mutation, to the sets of input-parameter

values through generations and maximizes the hypervolume

• Scenario mode: Driving simulator scenario editing tool

covered by the driving strategies found by the lower-level

• Simulation mode: Simulation supervision tool

algorithm.

• Analysis mode: Detailed graphical analysis tool

MODS was implemented in two variants. The original MODS

algorithm minimizes the traveling time and the fuel con-

Although The SCANeR Studio is a powerful tool for evalu-

sumption [2, 3], while the enhanced version, called MOCDS,

ating autonomous vehicle driving algorithms, it introduces

optimizes also the driving comfort [4].

some limitations for the development of these algorithms.

When taking into account the MODS algorithm, the SCANeR

Currently, the MODS algorithm is integrated and was eval-

Studio limitations presented Section 3.1 are the most critical

uated in a vehicle driving simulator that includes data from

ones.

real-world routes and a black-box vehicle simulator. The re-

sults show that MODS finds better driving strategies than

existing algorithms for discovering driving strategies. How-

3.1

Limitations of the SCANeR Studio

ever, MODS has some limitations. For example, it does not

Limited implementation of complex dynamic vehicle mo-

take into account neighbor/other vehicles on the route, it

dels

does not produce human-like driving strategies, and it does

The SCANeR Studio contains several types of complex dy-

not find (good) driving strategies in real time.

To over-

namic vehicle models such as cars, trucks and buses. The

come these shortages, the ongoing research consists of de-

instances of these models can be controlled only with throt-

sign and implementation of the enhanced MODS algorithm,

tle, braking and clutch pedals, gearbox, steering wheel etc.

and its deployment in the near-real-life environment, i.e.,

On the other hand, if an autonomous vehicle driving algo-

the SCANeR Studio simulation environment.

rithm controls the vehicle by setting vehicle’s velocity, such

algorithm cannot be evaluated with complex vehicle models.

3.

SCANeR STUDIO SIMULATION ENVI-

RONMENT

The SCANeR Studio contains also a large set of various

The SCANeR Studio [9] is a modular driving simulation

simple vehicle models. These models define some vehicle

environment, developed by the OKTAL company [11]. It

parameters such as length, weight etc., but do not simu-

simulates the vehicle driving behavior and the behavior of

late engine behavior in details. For example, they do not

22

simulate fuel consumption or engine limitations. Since no

The main issue of such behavior is the fact that the simu-

engine limitation is simulated, these simple models enable

lation results are not deterministic, i.e., the same scenario

to instantaneously change the vehicle velocity without lim-

with the same SCANeR modules does not always produce

itations, which results in unrealistic driving behavior. The

the same driving behavior and consequently the same travel-

advantage of these algorithms is that they can be controlled

ing time, fuel consumption etc. Therefore, an instance of the

by autonomous vehicle driving algorithms that set vehicle’s

autonomous vehicle driving algorithm has to be evaluated

velocity.

several times in order to statistically evaluate its quality.

Limited offline simulation

The synchronization issues are, however, present only in the

online simulation mode. On the other hand, the offline sim-

The SCANeR Studio enables to execute the simulation in

ulation mode does not have these issues, since it schedules

an offline mode that schedules the execution of the SCANeR

the execution of the SCANeR modules in advance and then

and other modules, involved in the simulation, with respect

executes them with respect to the predefined, determinis-

to their running frequency, and then executes them with

tic schedule. Consequently, the sequence of messages sent

respect to the schedule without any sleep time. This en-

between modules is also deterministic.

ables to execute the simulation as fast as possible, which is

specially suitable when the autonomous vehicle driving algo-

rithms control the vehicle behavior and no user interaction

3.2

Adaptation of MODS to the SCANeR Stu-

is required.

dio

The limitations of the SCANeR Studio significantly influ-

The offline simulation can, however, be used only when user

ences the enhancement of the MODS algorithm. Since the

executes such a simulation thought the SCANeR user in-

MODS algorithm controls the vehicle by setting vehicle’s ve-

terface.

On the other hand, when the simulation has to

locity, it requires to use SCANeR’s simple vehicle models.

be executed by a supervisor program, the offline simulation

Due to the fact that these models do not simulate engine be-

cannot be used. This represents a significant limitation for

havior in details, such engine behavior will have to be addi-

optimization programs that need to test various parameter

tionally implemented. This will require the implementation

settings of an autonomous vehicle driving algorithm, where

of, for example, wheel friction, aerodynamic, slope friction,

each setting has to be evaluated with the execution of the

inertial, maximum torque, specific fuel consumption, and

same simulation scenario.

maximum engine moving functions, in order to limit the ve-

hicle inertial force and calculate the fuel consumption.

Driving prediction is inapplicable

Several autonomous vehicle driving algorithms use the driv-

The MODS algorithm will require the usage of offline simu-

ing prediction technique to construct the driving strategy

lation in order to execute the multiobjective optimization al-

and select the best control action at each step. The usage

gorithm. Ideally, the multiobjective optimization algorithm

of driving prediction requires to periodically (ideally at each

should be implemented as a supervisor program. Due to the

step) stop the driving simulation, start the driving predic-

fact that a supervisor algorithm cannot control the offline

tion from the current vehicle and route state possibly mul-

simulation, the simulation will have to be controlled with

tiple times if various control actions have to be tested, and

HTTP POST calls.

Although controlling SCANeR with

then apply the obtained information during prediction in or-

POST calls is not optimal, this is the only mode that can

der to continue with the driving simulation. To execute such

be currently applied for the offline simulation.

a procedure, the main simulator has to simulate the driving,

while additional simulators have to be periodically executed

Driving prediction is one of the key mechanisms of the MODS

to predict the driving. Such a configuration is not supported

algorithm. Since this prediction is inapplicable in the SCAN-

by the SCANeR Studio, since SCANeR requires to execute

eR Studio, the MODS algorithm will have to be significantly

only one instance of SCANeR modules simultaneously. Oth-

redesigned. To compensate the missing driving prediction,

erwise, conflicts can appear in the communication between

a simple prediction approach will have to be added to the

modules if several instances of the same module are active

MODS algorithm, e.g., a set of rules that predict the future

simultaneously.

collision time.

MODS also requires exact evaluation of the driving strate-

Synchronization issues

gies. Since this can be achieved only by the offline simula-

The modules of the SCANeR Studio are implemented and

tion mode, only this mode will be used. Consequently, the

executed as independent processes that communicate be-

driving strategies obtained with the online simulation mode

tween them through network. Since they are executed as in-

will have to be re-evaluated with the offline simulation mode

dependent processes, their actual execution depends on the

before the comparison with the MODS driving strategies.

hardware and OS specifications, e.g., the number of proces-

sor cores that can simultaneously execute various processes.

4.

ENHANCEMENTS OF THE MODS AL-

Consequently, the sequence of messages sent between mod-

ules is not deterministic, which is also due to the network

GORITHM

latency. For example, messages from vehicle model contain-

The MODS algorithm is going to be significantly enhanced

ing the current vehicle velocity may sometimes be delayed

in order to take into account real-life driving situations and

with respect to messages for autonomous vehicle driving al-

to be evaluated with the SCANeR Studio. Currently, MODS

gorithm containing control actions, which may result in de-

is being adapted in order to control the vehicle’s velocity.

layed reaction of the autonomous vehicle driving algorithm.

In addition, the enhanced algorithm uses only the current

23

and the next velocity limits as hypercube dimensions of the

multiobjective optimization algorithm. Applied Soft

driving strategies [3]. The other attributes, such as the route

Computing, 16:50–62, 2014.

inclination, are added to the formulas that change the target

[4] E. Dovgan, T. Tušar, M. Javorski, and B. Filipič.

velocity. In the future, the hypercube dimensions will be

Discovering comfortable driving strategies using

redefined by taking into account the results of the analysis

simulation-based multiobjective optimization.

of the human driving strategies.

Informatica, 36:319–326, 2012.

[5] S. Elmer. BMW targets 2020 for self-driving cars,

To adapt to the SCANeR Studio limitations, the prediction

2013. Available online:

of the lower-level MODS is not used anymore. Neverthe-

http://www.autoguide.com/auto-news/2013/02/

less, a simple prediction will added to the MODS algorithm

bmw-targets-2020-for-self-driving-cars.html.

if needed. The lower-level MODS will be also enhanced to

[6] K. Fitchard. Ford is ready for the autonomous car.

simulate the car following, lane changing, and vehicle over-

Are drivers?, 2012. Available online:

taking behavior.

http://gigaom.com/2012/04/09/ford-is-ready-

for-the-autonomous-car-are-drivers/.

The upper-level MODS is going to be adapted to take into

[7] N. Ingraham. Mercedes-Benz shows off self-driving car

account the enhanced definition of the hypercubes. The new

technology in its new $100,000 S-Class, 2013.

parameters of the driving strategies, which will be stored in

Available online: http:

hypercubes, will include the vehicle velocity, acceleration,

//www.theverge.com/2013/5/18/4341656/mercedes-

deceleration, duration of acceleration, duration of decelera-

benz-shows-off-self-driving-car-technology.

tion, etc. In addition, the upper-level MODS will also be

[8] D. Johnson. Audi predicts self-driving cars by 2020,

enhanced to search for parameter values for the car follow-

2013. Available online:

ing, lane changing, and vehicle overtaking behavior.

http://www.leftlanenews.com/audi-predicts-

self-driving-cars-by-2020.html.

5.

CONCLUSIONS

[9] Oktal. Automotive simulators, 2016. Available online:

This paper presented the SCANeR Studio simulation en-

http://www.oktal.fr/en/automotive/range-of-

vironment and its limitations that are relevant for the en-

simulators/software.

hancement of the Multiobjective optimization algorithm for

[10] Oktal. Automotive simulators references, 2016.

discovering driving strategies (MODS) and its deployment in

Available online:

this Studio. Four main limitations were described: limited

http://www.oktal.fr/en/automotive/references-

implementation of complex dynamic vehicle models, limited

automotive-simulator.

offline simulation, inapplicability of driving prediction, and

[11] Oktal. OKTAL, 2016. Available online:

synchronization issues. The analysis of MODS and SCANeR

http://www.oktal.fr/.

Studio shows that MODS will need several adaptations be-

[12] R. Read. Toyota will roll out autonomous cars by the

fore the evaluation with the SCANeR Studio, e.g., the imple-

mid-2010s, 2013. Available online: http://www.

mentation of engine behavior and the algorithm redesign to

thecarconnection.com/news/1087636_toyota-will-

remove the driving prediction from the algorithm and adapt

roll-out-autonomous-cars-by-the-mid-2010s.

it accordingly.

[13] R. J. Rosen. Google’s self-driving cars: 300,000 miles

logged, not a single accident under computer control,

6.

ACKNOWLEDGMENTS

2012. Available online:

The work presented in this paper was in part funded by the

http://www.theatlantic.com/technology/archive/

NERVteh, raziskave in razvoj, d.o.o., which also provided

2012/08/googles-selfdriving-cars-300-000-

the SCANeR Studio simulation environment. In addition,

miles-logged-not-a-single-accident-under-

this work was also in part funded by the Slovenian Research

computer-control/260926/.

Agency under research project Z2-7581.

[14] Toyota. Lane Keeping Assist, 2014. Available online:

http://www.toyota-global.com/innovation/

7.

REFERENCES

safety_technology/safety_technology/

[1] Audi. Stay in lane, 2014. Available online:

technology_file/active/lka.html.

http://www.audi.co.uk/new-cars/a3/a3-

[15] Volkswagen. Lane assist, 2014. Available online:

sportback/driver-assistants/audi-active-lane-

http://www.volkswagen.co.uk/technology/

assist.html.

proximity-sensing/lane-assist.

[2] E. Dovgan, M. Javorski, T. Tušar, M. Gams, and

[16] Volvo. Volvo Car Group initiates world unique

B. Filipič. Comparing a multiobjective optimization

Swedish pilot project with self-driving cars on public

algorithm for discovering driving strategies with

roads, 2013. Available online:

humans. Expert Systems with Applications,

https://www.media.volvocars.com/global/en-

40:2687–2695, 2013.

gb/media/pressreleases/136182/volvo-car-group-

[3] E. Dovgan, M. Javorski, T. Tušar, M. Gams, and

initiates-world-unique-swedish-pilot-project-

B. Filipič. Discovering driving strategies with a

with-self-driving-cars-on-public-roads.

24





Ali nam metode strojnega učenja lahko pomagajo pri

načrtovanju novih visokoentropijskih zlitin?

Anton Gradišek

Andraž Kocjan

Miha Mlakar

Institut »Jožef Stefan«

Institut »Jožef Stefan«

Institut »Jožef Stefan«

Jamova 39

Jamova 39

Jamova 39

1000 Ljubljana, Slovenija

1000 Ljubljana, Slovenija

1000 Ljubljana, Slovenija

anton.gradisek@ijs.si

andraz.kocjan@ijs.si

miha.mlakar@ijs.si





POVZETEK

podobne atomske radije – zato pri zamenjavah ne nastopijo

Visokoentropijske zlitine (angleško: high entropy alloys, HEA) so

problemi zaradi različnih velikosti.

zanimiva skupina kovinskih materialov, ki so v zadnjem času

Zaradi specifične strukture imajo HEA vrsto obetavnih fizikalnih

pritegnile pozornost znanstvenikov zaradi lastnosti, zanimivih za

in kemijskih lastnosti, zaradi katerih so pritegnili zanimanje

uporabo v industriji. HEA so sestavljene iz petih ali več

inženirjev za uporabo v industriji. Gre za odlične mehanske

elementov, ki so med seboj naključno pomešani. Kljub mešanju

lastnosti (denimo visoka natezna trdnost), odpornost na korozijo,

pa nered ne pokvari globalne strukture, ki ostane pravilen kristal.

nizko obrabo, obstojnost pri visokih temperaturah itd. Poleg tega

Zaradi kompleksne zgradbe HEA težko obravnavamo s klasičnimi

je sinteza HEA relativno enostavna (običajno se pripravi talino

računskimi metodami znanosti materialov. V tem prispevku

ustrezne mešanice elementov in jo ohladi) in ne potrebuje

raziščemo, če lahko uporabimo metode strojnega učenja na zbirki

naprednih metod, kot je denimo rotacijsko kaljenje, ki se

že odkritih HEA, da bi lahko napovedovali strukturo sistemov z

uporablja za pripravo sistemov, ki so sicer obstojni v

novimi kemijskimi sestavami. Ugotovili smo, da binarni

temperaturnem območju, daleč od sobne temperature.[1]

klasifikator dobro določi posamezne strukture, s točnostjo tudi

nad 90 %, medtem ko bi za natančnejše delovanje klasifikatorja za

Specifična struktura in sestava HEA predstavlja določene

več razredov potrebovali večjo količino podatkov.

probleme pri usmerjenem načrtovanju strukture z metodami, ki so

sicer dobro poznane za enostavnejše kovinske sisteme. Pri

enostavnih sistemih namreč lahko definiramo osnovno celico

Ključne besede

kristala, s translacijami te celice pa nato lahko rekonstruiramo

Visokoentropijske zlitine, strojno učenje, kristalna struktura

celoten kristal. Za matematičen opis zato uporabimo periodične

funkcije, kar omogoči računsko določanje različnih lastnosti. V

1. UVOD

HEA tega pristopa ne moremo uporabiti. Čeprav je struktura

Visokoentropijske zlitine (HEA) so relativno nova skupina

enostavna (najpogosteje ploskovno ali prostorsko centrirana

kovinskih materialov. Prvič so jih v znanstveni literaturi

kocka), zaradi naključne razporeditve elementov ne moremo

sistematično opisali leta 2004, bolj intenzivne raziskave na tem

govoriti o pravi periodičnosti. Zato so zaželene nove metode, s

področju pa so se začele v zadnjem desetletju. HEA so zlitine

katerimi bi lahko dobili več informacij o HEA, ki bi nas pripeljale

najmanj 5 kovinskih elementov, pri tem, da je vsak od glavnih

do bolj načrtovanega načina sinteze novih sistemov.[2]

elementov zastopan vsaj med 5 in 35 atomskimi odstotki (lahko so

V pričujočem prispevku pogledamo, če namesto eksaktnih

prisotni tudi drugi elementi v manjših količinah).[1] Ti elementi

fizikalnih izračunov za napovedovanje strukture HEA lahko

imajo nizko entalpijo mešanja, zaradi česar v Gibbsovi enačbi

uporabimo bazo že znanih sistemov (v člankih [3,4] jih je

proste energije (ΔG = ΔH - TΔS, kjer je G Gibbsova energija, H

opisanih okrog 300) v povezavi z algoritmi strojnega učenja. Ti

entalpija, T temperatura in S entropija) pride do prevlade

algoritmi nam lahko odkrijejo povezave med posameznimi

entropijskega člena, kjer je entropija definirana z Bolzmannovim

parametri, ki jih sicer morda ne bi opazili, saj se pokažejo šele v

zakonom o konfiguracijski entropiji. Slednja je najvišja ob

visoko dimenzionalnem prostoru. Podobni pristopi se že

ekvimolarnem deležu vseh elementov in je tedaj odvisna le še od

uporabljajo na področjih, kot je medicina, kjer denimo

števila dodanih elementov N, ΔS = R lnN, kjer je R splošna

uporabljajo metode strojnega učenja za iskanje povezave med

plinska konstanta. Poenostavljeno bi lahko dejali, da entropijski

posameznimi geni in vrstami raka, ki bi ga ti geni lahko

člen v enačbi (ki predstavlja neurejenost sistema) v primeru HEA

povzročali.[5,6] Tu predstavimo prvo enostavno študijo, kjer nas

prevlada nad entalpijskim členom, ki bi sicer pripeljal do

zanima, kako dobro lahko s pomočjo algoritma napovemo

segregacije

posameznih

elementov

oziroma

do

tvorbe

strukturo HEA z znano sestavo.

intermetalnih faz z le dvema do tremi elementi in običajno

kompleksnejšo strukturo, kot jo imajo HEA zlitine sicer. Če

namreč pogledamo fazne diagrame za zlitine iz dveh ali treh

2. PODATKI IN METODA

elementov, vidimo, da v njih nastopa množica različnih faz in

Za bazo podatkov smo uporabili seznam HEA iz članka avtorjev

struktur, kar pomeni, da imajo take zlitine običajno kompleksno

Toda-Caraballo in Rivera-Diaz-del-Castillo [6], kjer je opisanih

mikrostrukturo. Posledično so običajno krhke in brez posebne

323 znanih HEA. Za vsakega od sistemov so navedeni naslednji

praktične uporabe. Po drugi strani pa ravno entropijski člen v

podatki:

HEA povzroči, da se atomi različnih elementov uredijo v relativno

majhno število enostavnih struktur. Pri tem je potrebno poudariti,

•

Družina oz. seznam elementov, ki v HEA nastopajo,

denimo AgAuCuPdSi.

da v HEA običajno nastopajo elementi, katerih atomi imajo

25

•

Kemijska sestava, ki pove tudi razmerje med elementi.

3. REZULTATI

Pri HEA CoCrFeNiCu so elementi zastopani v enakih

Klasifikacijskega problema se lahko lotimo na več načinov.

atomskih razmerjih, pri Ti0,5Co1,5CrFeNi1,5Mo0,5 pa ne.

Idealno bi bilo izdelati klasifikator, ki bi bil sposoben napovedati

katerokoli od struktur, ki nastopajo v naši bazi podatkov. Ker pa

•

Struktura. Najpogostejše strukture so BMG (bulk

metallic glass – kovinsko steklo), FCC (face-centred

imamo v več kot polovici od 42 razredov po le en ali dva primera,

cubic – ploskovno centrirana kocka), BCC (body-

se taka klasifikacijska metoda ne bo dobro obnesla. Če imamo

centred cubic – telesno centrirana kocka) ter

premalo učnih podatkov, se algoritem namreč dobro nauči

kombinacije FCC in BCC.

prepoznavati le obstoječe primere, ko dodamo nove primere, pa

odpove.

•

δ% - razlika v atomskih radijih

Klasifikacijski model zato najprej poskušamo zgraditi na petih

•

ΔH

razredih, torej na štirih najpogostejših strukturah in razredu

mix – mešalna entalpija (v kJ/mol)

Ostalo. Kot klasifikacijski algoritem se v tem primeru najbolje

•

Δχ% - razlika v elektronegativnosti

obnese Random Forest (naključni gozd), ki ima 62 %

klasifikacijsko točnost.

•

Ω, parameter, definiran kot Ω=Tm ΔSmix/| ΔHmix |, kjer je

ΔSmix entropija mešanja in Tm temperatura tališča

Bistveno boljše rezultate dobimo, če zgradimo binarne

klasifikatorje. V en razred postavimo primere z določeno

•

μ, parameter, definiran kot μ= Tm/TSC, kjer je TSC

strukturo, drug razred pa predstavljajo vsi ostali primeri. Tudi

temperatura spinodalne dekompozicije

tokrat se najbolje obnese metoda Random Forest. Klasifikacijske

točnosti za binarne klasifikatorje s to metodo so predstavljene v

•

VEC – koncentracija valenčnih elektronov (ang.=

valence electron concentration)

Tabeli 2.



•

e/a – število valenčnih elektronov na število atomov v

Tabela 22. Klasifikacijske točnosti binarnega klasifikatorja

osnovni celici

Izbrana struktura vs. Ostale strukture z metodo Random

•

s

Forest

m% – neujemanje medatomskih razdalj

•

K

Struktura

Točnost (%)

m% - neujemanje parametrov mreže

Večino zgornjih parametrov lahko izračunamo iz atomske sestave

BMG

91

in parametrov za posamezne atome.[6] Nas je zanimalo, če lahko

FCC+BCC

87

na podlagi opisanih parametrov določimo fazno sestavo ter

njihovo kristalno strukturo. Število HEA s posamezno strukturo je

BCC

77

predstavljeno v Tabeli 1.

FCC

92





Tabela 1. Število HEA s posamezno strukturo v bazi podatkov

Vidimo, da so klasifikacijske točnosti binarnega klasifikatorja

Struktura

N

veliko višje od tistega, ki deluje na petih razredih. V primeru

večkratnega klasifikatorja imamo namreč v vsaki od množic

BMG

76

majhno število elementov, pri binarnem klasifikatorju pa nekaj od

FCC+BCC

74

teh množic združimo in tako povečamo število elementov.

BCC

31

Metoda Random Forest deluje tako, da na naključnih

podmnožicah učne množice zgradi odločitvena drevesa, nato pa s

FCC

31

pomočjo teh dreves klasificira nov primer – končni razred je tisti,

ki ga napove največ dreves. Čeprav je ta metoda najbolj uspešna

Ostalo*

111

pri klasifikaciji, ni intuitivno razumljiva za uporabnika. Podatke

* V razredu Ostalo se nahajajo vse preostale strukture, ki so

lahko namesto tega predstavimo s posameznim odločitvenim

zastopane z manj kot 30 primeri. Gre za intermetalike (IM),

drevesom (klasifikacijski algoritem J48), kjer v vsakem listu

heksagonalno celico (HCP), Lavesovo fazo (L), kombinacije

drevesa naredimo odločitev glede na določen parameter. Primer

zgoraj navedenih in drugo. Skupaj imamo 42 možnih struktur,

takega odločitvenega drevesa za binarni klasifikator BMG vs.

vendar pa je večina le-teh zastopana z le po enim ali dvema

Ostale strukture je prikazan na Sliki 1. Pod spremenljivko Value

primeroma.

prva vrednost označuje število ostalih struktur, druga pa število



BMG. Klasifikacijska točnost tega drevesa je 89 %. Vidimo, da je

najpomembnejši parameter δ%, sledita mu ΔH

Napovedovanje

strukture

HEA

lahko

obravnavamo

kot

mix in μ.

klasifikacijski problem, ki se ga lotimo z algoritmi strojnega

učenja. Pri tem strukturo obravnavamo kot razred, devet

4. DISKUSIJA

parametrov (δ%, ΔH

Kot pokaže naša enostavna študija, je uporaba metod strojnega

mix, Δχ%, Ω, μ, VEC, e/a, sm% in Km%) pa

kot atribute. Preizkusili smo več algoritmov, ki so vgrajeni v

učenja

lahko

koristna

pri

napovedovanju

strukture

program Weka.[7] Vsakič smo klasifikacijsko točnost preverili z

visokoentropijskih zlitin. V znanosti materialov je še posebej

metodo navzkrižne validacije, pri kateri začetno množico

zanimivo iskanje prehodov med posameznimi strukturami. Naš

podatkov razdelimo na N podmnožic, nato zgradimo skupen

pristop nam omogoča izdelavo enostavnega simulatorja strukture.

klasifikacijski model na N-1 podmnožicah in ga testiramo na

Če nas denimo zanima prehod med BMG in BCC, ko

preostali. To ponovimo N-krat.

spreminjamo razmerje dveh od petih elementov v zlitini, nam tak

26





simulator lahko pove, okrog kakšne sestave lahko pričakujemo

prehod med strukturama. Na ta način lahko k sintezi novih

sistemov pristopimo bolj sistematično in ne le na podlagi

poskušanja. Naše modele lahko izboljšamo z večjo količino

podatkov v učni množici, preučiti pa velja tudi, ali lahko uvedemo

še dodatne atribute, ki bi v model vnesli nove informacije.



Slika 1: Odločitveno drevo za binarni klasifikator BMG vs. Ostale strukture.



5. REFERENCE

[1] Tsai, M. H., Yeh, J. W., High-Entropy Alloys: A Critical

Review, Materials Research Letters, 2014, 2, 107-123.

[2] Murty, B.S., Yeh, J.W., Ranganathan, S. High-Entropy

Alloys, Butterworth-Heinemann, Elsevier, 2014, p. 57.

[3] Poletti, M. G., Battezzati, L. Electronic and thermodynamic

criteria for the occurrence of high entropy alloys in metallic

systems, Acta Materialia, 2014, 75, 297-306.

[4] Toda-Caraballo, I, Rivera-Diaz-del-Castillo, P. E. J., A

criterion for the formation of high entropy alloys based on

lattice distortion, Internetallics, 2016, 71, 76-87.

[5] Wang Y. et al., Gene selection from microarray data for

cancer classification—a machine learning approach,

Computational Biology and Chemistry, 2005, 29, 37-46

[6] Guyon I. et al., Gene Selection for Cancer Classification

using Support Vector Machines, Machine Learning, 2002,

46, 389-422.

[7] Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann,

P., Witten I. H. The WEKA Data Mining Software: An

Update; SIGKDD Explorations, 2009, 11, 1.

27





Markov chain model for energy-efficient context

recognition

Vito Janko

Mitja Luštrek

Jožef Stefan Institute

Jožef Stefan Institute

Jožef Stefan International Postgraduate School

Jožef Stefan International Postgraduate School

Jamova 39, 1000 Ljubljana, Slovenia

Jamova 39, 1000 Ljubljana, Slovenia

vito.janko@ijs.si

mitja.lustrek@ijs.si

ABSTRACT

sound level, or some higher level activities such as ”shop-

Continuous sensing of the user and subsequent context recog-

ping”, ”traveling” or ”working”.

nition using wearable sensors is a popular area of research.

One of the big problems of such automatic context recogni-

A major limitation of such continuous sensing and context

tion is the battery life of the sensing device. To maximize

recognition is its heavy toll on the sensing device’s battery

the energy efficiency, the context recognition system should

life. This is especially relevant for smartphones, which have

adapt its settings to the current situation. Choosing the ap-

a very limited battery that must be shared between many

propriate setting for each situation usually requires either a

applications, but the same limitation applies to basically

lot of expert knowledge or extensive experimentation. We

any wearable device. This issue is often neglected when dis-

propose a method that simulates all possible combinations

cussing the design of context-recognition systems, however

of contexts and settings using a Markov chain model, au-

it is an essential component if such systems are to be used

tomating and speeding up the whole process. We show on

in practice.

a small example that the simulation is accurate, and that it

allows us to quickly select best trade-off between the energy

Some solutions deal with this issue by optimizing the sam-

efficiency and context-recognition accuracy.

pling rate or sensor duty cycle for the particular recognition

task [1].

This alone however, might be suboptimal.

We

Categories and Subject Descriptors

might for example want to have the GPS active at a high

sampling rate while the user is driving, but at a very low

G.3 [Probability and statistics]: Markov processes; H.2.8

rate when he is working in the office. This calls for dynamic

[Database Applications]: Data mining

changes in the settings for both the sensors and for the sub-

sequent processing. Many adaptive approaches already exist

Keywords

[3][4][5][6].

Context recognition; continuous sensing; energy efficiency;

Markov chain; acitivity recognition.

A problem with these and similar solutions is that they have

complex pipelines and/or require many parameters specific

1.

INTRODUCTION

to the particular recognition problem. If we were to recog-

Widespread accessibility of wearable sensing devices allows

nize different activities using different sensors we would have

for many possibilities for tracking the users who wear them.

to adapt these parameters, requiring either a lot of experi-

Possible applications range from measuring their exercise

mentation or expert knowledge. It would therefore be useful

patterns and checking on their health, to giving them location-

to have a method that can be provided with a sensor-rich

specific recommendations. The recognition of user’s context

dataset, and it would be able to tell which sensors or which

using sensors is a popular and mature area of research [2].

sensor setting to use in each context.

For example, using activity recognition, we can recognize

”walking, ”running”, ”resting” and similar activities from

A relatively simple solution of the above problem can be

accelerometer data. This task was made easier and more

found in the work of Yan, et al. [6]. They select the sensor

practical with the increased use of smartphones, which have

and attribute settings based on the last classified activity.

many sensors built in and are often carried. Sensing with

For each activity they find the setting that best recognizes

multiple sensors, possibly at once, opens additional options

it, by testing all the settings on their dataset, and then use

for context recognition: detecting one’s location, ambient

it when that activity is detected. However, since the selec-

tion is made for each activity in isolation, the effect on the

whole system can be unpredictable. To illustrate: an ac-

celerometer is very good at recognizing walking and resting,

while a GPS is very good at recognizing driving. However, if

we have only the accelerometer active while walking, driving

will never be detected and the sensor switch will never oc-

cur. To take such interactions into account, we would have

to run an experiment that has a specific setting for each

activity and switches between them in runtime. Since there

28

are many (activity, setting) combinations, this process might

S(c1, s1) - the Markov state with context c and setting s.

be prohibitively time consuming, possibly taking months on

T (a, b) - the probability in the dataset that the next context

large datasets with many possible settings.

will be b given that the current one is a.

C(s) - the set of all contexts that use setting s.

We propose a Markov chain model to simulate runtime set-

Cs(c1|c2) - the probability that the classifier that works with

tings switching and predict what would happen if we used

setting s will classify an instance to c1, if the true context is

a particular setting for a particular context.

Using this

c2.

model requires only a small amount of actual experimenta-

tion. Since the experimentation is a lot more time consum-

ing than the simulation, many more combinations can be

Having all transition probabilities, we can use the basic

tried out and a better combination is therefore expected to

Markov chain calculus to calculate the steady state of the

be found with the same resources. In this paper we explain

Markov chain. This gives us the amount of time the system

the proposed Markov chain and try it on a simple activity

will be in each of the states. Since we know how much time

recognition dataset.

any setting is active and how much energy this setting con-

sumes per time unit, the energy consumption of the whole

2.

METHOD DESCRIPTION

system can be estimated. Additionally, since confusion ma-

Suppose we have a sensing system that works as follows:

trices give us the accuracy for each state, we can calculate

the user is in one of predefined contexts (for example, the

the accuracy of the whole system.

context can be the current activity) and each context is as-

sociated with some setting. Settings can be which sensors

are in use, or what sampling frequency or duty-cycle policy

X

Energy estimation =

t(m)e(s(m))

is active, or which feature set is used for classification, and

m∈M

so on. Using the current setting, we watch for a possible

context change, and if it happens, we switch the sensor set-

ting to the one assigned to the new context. For example: if

M - the set of all states in the Markov chain.

we are sitting we need a lower sensor frequency, compared to

t(m) - the predicted proportion of time spent in state m.

when we are walking. To determine the optimal parameters

e(s) - the energy requirement of a setting s in a given time

for such a system, we might want to try every combination

unit.

of assignments of reasonable settings to possible contexts. If

s(m) - the setting corresponding to state m.

we have c contexts and s settings, we would have sc different

combinations. Our goal is to make only s experiments and

simulate the rest, gaining a drastic increase in efficiency.

X

Accuracy estimation =

t(m)acc(c(m), s(m))

We begin by selecting all reasonable settings. For each of

m∈M

them, we make an experiment where the classification model

is trained and tested with this particular setting. For each

experiment, we calculate and remember the confusion ma-

acc(c, s) - the accuracy of the classifier that works with set-

trix. Additionally we need the transition probability from

ting s, if the true context is c.

one state to the next one; that can easily be inferred from the

c(m) - the context corresponding to state m.

dataset. Finally we must make energy consumption estima-

tion for each setting. Energy consumptions for most sensors

at different configurations are known or can be estimated

It should be noted that many other metrics can be deter-

with simple measurements.

mined from such a model. Example: the accuracy for a par-

ticular activity or the latency of activity change detection.

To simulate an experiment where the settings are dynami-

They can be used instead of the accuracy when evaluating

cally switched depending on the current activity, we create

the performance.

a Markov chain. This model has a state for each possible

(context, setting) combination. The Markov state (c,s) rep-

Every simulated experiment represents a possible trade-off

resents that we are currently in context c, with the system

between the accuracy and energy consumption. Simulating

having setting s (which depends on which context the sys-

all the combinations, the Pareto optimal trade-offs can than

tem believes we are currently in).

be presented to the application designer, which - knowing

the energy and accuracy requirements of the application -

Next we have to calculate the transition probability from

can then choose the ideal settings for it.

one Markov state to another. They can be calculated from

the transition probabilities of contexts and data from the

3.

EXPERIMENTAL SETUP

previous computed confusion matrices. Intuitively: we get

We will demonstrate our method on a simple example. We

in a state (c,s) if the context really changes to c and if the

have a dataset of accelerometer data generated by a smart-

system classifies this instance into one of the contexts that

phone, and we want to classify basic activities (Table 1) from

have assigned setting s.

it. We can do this by using two different settings: a high

frequency sampling (50 Hz) and a low frequency sampling (2

Hz). Two extremes were chosen for simplicity. In practice

X

S(c

we would perhaps also try other frequencies (10 Hz, 20 Hz

1, s1) → S(c2, s2) = T (c1, c2)

Cs (c|c

1

1)

c∈C(s

etc...) and duty cycles. We assign one of the two frequencies

2 )

29



to each activity, generating 16 different combinations. All

16 combinations will be simulated by doing only 2 actual

experiments. In this case we might expect that the best

combination would be using the low frequency for the activ-

ity rest, and the high frequency for the other activities, but

the trade-offs are not so clear in general.

The confusion matrices are in Table 2 and Table 3. States

of the generated Markov chain are in Figure 1. Note that

such a chain is in principle fully connected, including the

connections of each state to itself.

s

w

r

c

62.7

21.1

10.4

5.8

Table 1: Proportion (in %) of each activity in the

dataset. The values are in %. s - rest, w - walking,

r - running, c - cycling

Figure 1: Markov chain states for our example. Ver-

s

w

r

c

tical axis signifies the true activity, while the hori-

s

97.6

1.4

0.0

1.0

zontal signifies the setting, which depends on the

w

5.6

86.2

5.5

2.7

last classified activity.

r

0.7

10.0

89.1

0.2

c

22.9

16.4

0.0

60.7

Table 2: The confusion matrix using the high fre-

We also plotted the Pareto front that shows the sensible

quency. The values are in %.

solutions. We see that the Markov chain simulation points

very closely correspond to the non-simulated ones. The sim-

ple simulation captures the general trend of the Pareto front,

s

w

r

c

but makes substantial mistakes in predicting the actual val-

s

97.2

2.3

0.1

0.4

ues. If the interaction between sensors and activities were

w

9.5

82.4

7.2

0.9

more complex, we expect the error to be be even greater.

r

3.9

28.4

67.7

0.0

The error is numerically evaluated in Table 4.

c

46.8

32.6

1.5

19.1

acc.

energy

Table 3: The confusion matrix using low frequency.

Markov

2.03

0.35

The values are in %. We can observe that the accu-

Simple

12.82

1.83

racy for rest does not change much compared to the

high frequency case, while the accuracy loss when

Table 4: The average prediction mistake, made on

cycling is quite drastic.

energy consumption and classification error for the

simple and Markov chain model.

The values are

given in % with respect to the maximal value for

We also did a simpler simulation in the spirit of Yan, et.

the corresponding axis.

al [6], where every activity was considered in isolation. In

this case, the accuracy of the system was computed simply

by computing the accuracy for each activity given its set-

We also explored what happens if the underlying activity

ting and then weighted by this activity’s proportion in the

distribution in the dataset changes. This can be easily simu-

dataset.

lated by modifying the transition probabilities of the Markov

chains. It turns out that while the values themselves change

4.

RESULTS

drastically, the overall shape of the Pareto front remains sim-

ilar. This means that the best simulated trade-off likely re-

The method was evaluated in the following way. First all

mains the best with most other activity distributions. Such

the simulation trade-offs were computed (both Markov chain

simulation is also handy to see if the energy requirements ex-

simulation and the simple one). Then we ran the actual

ceed the application limits if some condition changes. Note

experiments we were simulating, switching classifiers and

that no additional experiments with the actual data were

sampling frequencies during the runtime. All three sets of

needed to generate this information, which is an additional

results were plotted in Figure 2.

benefit of our method.

The trade-offs in Figure 2 are marked with letters that cor-

respond to the activities where the low frequency was used.

The case where only rest was used with the low frequency

(marked as ’s’) could be considered the best trade-off be-

tween the energy gained compared to the accuracy lost.

30



6.

REFERENCES

[1] Aftab Khan, Nils Hammerla, Sebastian Mellor, and

Thomas Plötz. 2016. Optimising sampling rates for

accelerometer-based human activity recognition.

Pattern Recognition Letters (2016).

[2] Seon-Woo Lee and Kenji Mase. 2002. Activity and

location recognition using wearable sensors. IEEE

pervasive computing 1, 3 (2002), 24–32.

[3] Hong Lu, Jun Yang, Zhigang Liu, Nicholas D Lane,

Tanzeem Choudhury, and Andrew T Campbell. 2010.

The Jigsaw continuous sensing engine for mobile

phone applications. In Proceedings of the 8th ACM

conference on embedded networked sensor systems.

ACM, 71–84.

[4] Jeongyeup Paek, Joongheon Kim, and Ramesh

Govindan. 2010. Energy-efficient rate-adaptive

GPS-based positioning for smartphones. In

Proceedings of the 8th international conference on

Mobile systems, applications, and services. ACM,

299–314.

[5] Yi Wang, Jialiu Lin, Murali Annavaram, Quinn A

Jacobson, Jason Hong, Bhaskar Krishnamachari, and

Norman Sadeh. 2009. A framework of energy efficient

Figure 2:

Black points are real trade-offs, blue

mobile sensing for automatic user state recognition. In

points are simulated with Markov chains, red points

Proceedings of the 7th international conference on

are simulated with the simple model. The Pareto

Mobile systems, applications, and services. ACM,

fronts are drawn using corresponding colors.

The

179–192.

lower left corner represents the point with the low-

[6] Zhixian Yan, Vigneshwaran Subbaraju, Dipanjan

est error and the lowest energy consumption. We

Chakraborty, Archan Misra, and Karl Aberer. 2012.

can see that Markov chain simulation corresponds

Energy-efficient continuous activity recognition on

very closely to the values of real experiments.

mobile phones: An activity-adaptive approach. In

Wearable Computers (ISWC), 2012 16th International

Symposium on. Ieee, 17–24.

5.

CONCLUSION AND FUTURE WORK

The simulations display a very high decree of fidelity to the

actual experiments and are very fast. Our method can thus

effectively tackle the important but difficult task of selecting

system settings that give us a good compromise between the

accuracy and energy efficiency.

Future work on the topic will include testing the proposed

method on a more complex dataset that contains more sen-

sors and activities to see if the results still have the same

fidelity. Another improvement will be to explore options for

searching the settings combination space more efficiently.

Since this is essentially a multi-objective optimization prob-

lem, many approaches from that area can be then used to

further increase the effectiveness of our method.

31





Reliability estimation of individual predictions for

incremental models

Anže Javornik

Igor Kononenko

Darko Pevec

univ. dipl. inž. rač. in inf.

Ph. D., Professor

PhD.

Dragomelj 150

Faculty of Computer and Inf. Sc.

Faculty of Computer and Inf. Sc.

1230 Domžale

Večna pot 113,1000 Ljubljana

Večna pot 113,1000 Ljubljana

+386 41 550 515

+386 1 479 8230

+386 1 479 8230

anze.javornik@pointar.si

xaigor@fri.uni-lj.si

darko.pevec@gmail.com





ABSTRACT

Since the knowledge is static and based on the data that was

Reliability estimate of an individual prediction can represent a very

gathered from the sensor when it was still in a good operating form,

important information, especially if a wrong decision can have

the accuracy of the model and reliability estimate will drop. To

serious consequences. Same as classical prediction models,

battle such issues an incremental approach is needed.

classical methods for reliability estimation are based on known

examples – the learning set. Such methods cannot be used with

2. INCREMENTAL LEARNING

incremental models, since this knowledge is not available. In this

Classical learning requires that enough data is available to construct

article we present three known methods of reliability estimation

knowledge. Incremental learning has no such requirements. In the

and propose their incremental versions. We tested incremental and

process of incremental learning each new marked example is used

classical methods using four incremental models on eight data

to refine the current model.

streams with and without drift. Obtained results were statistically

analyzed and compared.

2.1 Incremental Naive Bayes Classifier

Naive Bayes classifier makes prediction with an assumption that

Categories and Subject Descriptors

different attribute values with given class are conditionally

I.2.m [Artificial intelligence]: Miscellaneous

independent. The classification formula is based on Bayes rule:

| )

General Terms

| ) =

)



)

Algorithms, Reliability.

The task of learning process is to use the learning set and

Keywords

approximate probabilities on the right hand-side of equation. The



knowledge of naive Bayes classifier is a table of approximations of

Incremental, Machine learning, Classification, Reliability estimate

prior probabilities of classes P(rk) and a table of conditional

probabilities P(rk | vi) of classes rk with value vi for attribute Ai.

1. INTRODUCTION

In incremental learning the algorithm needs to refine prior

Task of machine learning is to construct a model, which is used for

probabilities and conditional probabilities. In [2] the adjustment

forecasting. The learning process uses known examples to adapt the

formula for prior probabilities is defined as

prediction model in such a way that it is able to solve problems

1

better. The result of learning process is knowledge which can be a

set of remembered examples, an algorithm or a set of rules which

′ ) =

+ 1

) + 1 + ; =

describe the problem domain [1]. Such knowledge is called a

model, which can be used for understanding the problem better or

+ 1

)

;

≠

for predicting new examples.

where

The process of forecasting is closely linked to the concept of

= | | + | |

reliability. In situations when wrong prediction can have serious

and for conditional probabilities

consequences, we would like to know how accurate the prediction



1

is. However, for an unmarked example we cannot provide an exact

accuracy but can only estimate it.

1 +

, ) + 1 + ; = !" =

, ) =





Classical prediction models and methods of reliability estimation

1 +

, )

;

= !" ≠

are based on examples from a problem domain and so their

, )

; ≠

knowledge is static. Often the problem domain changes over time.

where

Consider the sensor readings whose sensitivity is decreasing.

= | | + #$%"& ')



where |Ai| represents the number of values of attribute Ai and



count(rj) represents the number of examples whose class is rk.



32



2.2 Incremental Decision Trees

that it forgets previously seen examples and is thus sensitive to

Decision tree is composed of internal nodes that correspond to

example distribution.

attributes, branches that correspond to subsets of attribute values

The space adjustment strategy which we developed in our work

and leaf’s that correspond to classes. A path from root to a leaf

differs from the sliding window in a sense that it does not forget

corresponds to one decision rule. The task of the learning process

previously seen examples. Instead of tracking the space of specific

is to build a tree from the learning set in such a way, that for each

examples (xi, Ci) we track the space of examples’ description

current level it finds a most important attribute – one who splits the

(x ’

’

i , Ci ). Let

example set best – and use it as the root of current subtree.

3 = 4 5 , 6 , … , 58, 68 9

Incremental algorithms are refining the decision tree for each new

be an example space where each example x is defined with a class

example. The goal of the algorithm for incremental learning of

label C and a set of attributes

decision tree is to construct the same decision tree as non-

incremental algorithm on the same set of examples. An example of

5 4: , … , :89

incremental learning algorithm is ID5R, which for each new

A description of example space E is defined as (x ’

’

E , CE ) where

example finds most important attribute Ai and lifts it towards the

5<

4:< , … , :< , 6<9

root of the tree and refines the tree [3].

;

8



:< ⨍ 5 . : , … , 5

2.3 Incremental Bagging and Boosting

8. :

<

Traditional algorithms for supervised learning give their result

6; ⨍ 5 .6 , … ,58. 6

according to the individual basic prediction model. Aggregated

⨍ :, ?

: @ ∗ ? / :

learning algorithms combine predictions of many such base models

Let

where each of them is learnt in traditional way. Bagging and

boosting are well known aggregate algorithms for which it was

ℙ 4 5<, 6< , … , 5< , 6< 9

shown that they provide better results compared to individual

be a set of descriptions where 5< is a description and 6< the class

models [4].

of description. For each new example 5, 6 we need to decide

Bagging is a method in which M different learning sets are

either we join this example into existing description or start

generated using randomly selected examples from original learning

building a new description with it. We based this decision on ratio

set which can be repeated. Each generated learning set contains

between the closest and the furthest description with the same class

each example K-times, where

label. If the ratio is smaller than some theta value we join it with

1

1 01

the closest description of the same label, else we start building a

( = )) = *+),-

new description. In case of building a new description, we also

+. -1 / +.



implemented the logic to reduce space size. If the number of total

which is a binominal distribution. With N → ∞ distribution of K

remembered descriptions exceeded the allowed amount, then in the

approximates to Poisson(1) distribution.

whole space we find the closest two descriptions with the same

This feature is utilized by incremental bagging. For each new

class label and join them into one. Using the mentioned method on

example for each individual base model M the algorithm chooses

first hundred examples of SEA generator [6] the space contained

this example K ~ Poisson(1) times and refines the model M.

only fifty-nine descriptions. On the left side of Figure 1 we see how

the space would look like if we would remember every example

Boosting is a more complex process of base model set generation.

and on the right side is the space using the space adjustment

Each base model M1, …, Mn is learnt with weighted learning set.

strategy.

The weight is defined by the classification error of previous base

prediction models.

Incremental learning for boosting uses Poisson sampling to

approximate aggravating. The algorithm works as the algorithm for

bagging with a difference that when a base model wrongly

classifies an example the Poisson distribution parameter, bound to

the example, is incremented for the next base model, else it is



decreased [5].

Figure 1. Distribution of examples in internal space

In our work we took three reliability scores (CNK, LCV, BAGV)

3. RELIABILITY ESTIMATION

from [5] and modified them by adding the incremental nature.

Since we do not know the actual result of an example we can only

speculate about the prediction accuracy. The speculation is based

3.1 Incremental Local Error Score

on previously seen examples. In most cases we focus only on space

Classical local error score is defined as follows: let K be the

near the current example – that is closest neighbors.

models’ prediction of an unmarked example 5, / and +

4 5 , 6 , … , 5 , 6 9

With incremental learning the space of known examples becomes

the set of closest neighbors, where Ci is

unlimited which poses a problem since we do not have unlimited

correct class label of the i-th example. The reliability estimate

space nor the time to find the closest neighbors. In our work we

(CNK) is then the average distance between the class label of the

used two different strategies to deal with space of known examples:

closest neighbors and the models prediction

sliding window and space adjustment.

∑ ƒ 6 , (

6+(

The sliding window is a simple strategy. In this strategy we

)



remember the new examples and forget old examples, always

Incremental version (iCNK) for the example space management

keeping a specific number in memory. Downside of this strategy is

uses previously described strategy, the space adjustment. Instead of

33

using actual examples iCNK operates on closest descriptions

2.

for 5 , 6 in ℙ

+ 4 5<, 6< , … , 5< , 6< 9, and is defined as

3.

construct model U

"VW ℳ on ℙ ∖ 5 , 6

∑ 	ƒ 6<, (

!6+(

4.

use model U and get prediction ( for 5

)



5.

calculate distance ƒ 6 , (

where 6< is the class label of the i-th description (Algorithm 1).

6.

end for



7.

calculate score !R6

∑ ƒ FG,HG

Algorithm 1. iCNK

Definitions:

N – max. number of descriptions

8.

refine ℙ with (x, K)





E – theta

9.

if |ℙH| > +





@ – adjustment rate

10.

find descriptors , ' 	 ∈ 	ℙ , for which





ℙ – description set

min ƒ

,

,'

' and . 6

'. 6





k – num. of relevant closest neighbors

11.

in ℙ replace , ' with

'

Input:



example with prediction (x, K)

12.

end if

Output:

reliability estimate

13.

return score

1.

find set of k closest neighbors ℙ



2.

I

calculate score !6+( @, +

∑GJK	ƒ FG,H

L



3.3 Incremental Variance of Bagging Model

3.

refine ℙ with (x, K)

The aggregated prediction model, which is composed of m

4.

if |ℙ| > +

prediction models, provides its prediction as the average of

individual predictions:

5.

find descriptors , ' 	 ∈ 	ℙ , for which

Y

min ƒ

,

(

,'

' and . 6

'. 6

( 	S

6.

in ℙ replace , ' with

'

7.

The reliability estimate BAGV is defined as the variance of

end if

prediction distributions:

8.

return score

1 Y



Z [

S ( / ( \

3.2 Local Cross-Validation Score

The incremental version for space management uses a version of

The main idea behind LCV score is that it is independent from the

“sliding window” strategy. We define iBAGV on space

models’ prediction. From the set of closest neighbors +

ℳ 	]U , … , U

4 5 , 6 , … , 5 , 6 9

8, … , U8^_à	 ,b∗8 c, where Mi is an incremental

it builds k models using cross validation

model, d the rate of forgetting and n a predefined number of

“leave one out” and uses the model to predict the example left out.

models. The position (order) of a model M in ℳ is also important.

The score is then defined as

In a sense we have converted a set of aggregated models in BAGV

ƒ 6 , (

algorithm to a list and extended it to include additional "

R6

S

max	 1, d ∗ "

)

models. The models under indexes 1 to n are then

used for calculating the score, while models under indexes n + 1 to

Incremental version of the algorithm (iLCV) uses the adjustment

the end are used for incremental learning. When it is time to

space strategy for managing examples and is thus operating on

implement the forgetting feature, we take first ∗ d models from

+ 4 5<, 6< , … , 5< , 6< 9 (see Algorithm 2)

start of the list, re-initialize them and push them to the end of the

list, forcing previous models towards start (see Algorithm 3).

Algorithm 2. iLCV

Algorithm 3. iBAGV

Definitions:

N – max. number of descriptions

Definitions:

ℳ– prediction model list





E – theta





g – adjustment rate





@ – adjustment rate





d – forgetting rate





ℙ – description set





n –number of models





k – num. of relevant closest neighbors





bagInit – is bag already initialized





ℳ– base incremental model





count – counter for applying forgetting

Input:



example with prediction (x, K)

Input:



example with prediction (x, K)

Output:

reliability estimate

Output:

reliability estimate

1.

find set of k closest neighbors ℙ

34

1.

define relevant model set ℳh

4U 9, where U ∈ ℳand

Results show that incremental scores and classical scores had about

! ≤ "

the same percentage of statistically significant positive correlation.

The estimate iBAGV also shows a good percentage of statistically

2.

for U in ℳh

significant negative correlation. However, estimates iCNK and

3.

calculate prediction ( with U for 5

iBAGV in most tests provided a higher correlation coefficient. The

estimate iLCV performed about as well as its classical version.

4.

end for

5.

calculate score !Z [

∑Y ( / ( \

100%

Y



6.

define learning set ℳjk l8

m

ℳh, "$&	?:no"!&

ℳ/ℳh, ?:no"!&



50%

7.

randomly choose ℳq

rℳjk l8r ∗ 1 	g models

0

with repeat from ℳjk l8

8.

use (x, K) to learn models in ℳq

-50%

9.

increase count by 1

10. if #$%"& ≥ "

-100%



11. replace first " ∗ 	d models from ℳ with new and push

Figure 3. Results on generators with drift

them into the end of list ℳ

Results on generators with drift are shown in Figure 3. Additional

12. set #$%"& = 0 and bagInit = True

gray columns represent the change compared to generators without

13. end if

drift. We can notice a good drop in statistically significant positive

correlations in classical estimates CNK and BAGV and the increase

14. return score

in statistically significant negative correlations in the incremental



estimate iLCV. Incremental scores show a much lower drop in

percentage of statistically significant positive correlations. The

4. TESTS AND RESULTS

same as in tests on generators without drift, incremental scores

We tested the developed algorithms and their classical versions on

iCNK and iBAGV in most tests, where both correlation coefficients

4 incremental models (Naive Bayes, Hoeffding Tree, OzaBag and

are statistically significant, provide higher correlation coefficients

OzaBoost) from [7] using 8 generators from [7] with and without

than their classical versions. Also we can notice that unlike in test

drift. We test the calculated scores against actual error by

with no drift, all incremental scores have a higher percentage of

calculating the Person’s correlation coefficient which we

statistically significant positive correlation.

substantiated with statistical significance.

5. REFERENCES

[1] I. Kononenko, M. Kukar. Machine Learning and Data

Mining. Horwood (2007).

100%

[2] S. Ren, Y. Lian, X. Zou, Incremental Naive Bayesian

80%

Learning Algorithm based on Classification Contribution

60%

Degree, Journal of computers, vol. 9, no. 8, August 2014:

1967 - 1974.

40%

20%

[3] Utgoff, P. E. (1989). Incremental induction of decision trees.

0

Machine Learning, 4, 161-186.

-20%

[4] N. Oza, S. Russell, Online bagging and boosting, In Artificial

-40%

Intelligence and Statistics 2001, 105–112, Morgan



Kaufmann (2001).

Figure 2. Results on generators without drift

[5] Z. Bosnić, Ocenjevanje zanesljivosti posameznih napovedi z

The results of tests on generators without drift are presented in

analizo občutljivosti regresijskih modelov, doktorska

Figure 2. On the left side filled columns represent the percentage of

disertacija, Fakulteta za računalništvo in informatiko,

tests where the reliability score had a positive correlation

Ljubljana (2007).

coefficient and was statistically significant and empty columns

[6] W. Nick Street, YongSeog Kim, A streaming ensemble

represent the percentage of tests where the score had a negative

algorithm (SEA) for large-scale classification, KDD ’01:

correlation coefficient and was statistically significant. The right

Proceedings of the seventh ACM SIGKDD international

side shows the comparison between incremental and classical

conference on Knowledge discovery and data mining, 377-

versions of reliability scores. The filled column represents the

382 (2001).

percentage of tests where both scores had a statistically significant

correlation coefficient and incremental version had a higher value

[7] Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard

of it, while the empty columns show where higher value of

Pfahringer (2010); MOA: Massive Online Analysis; Journal

correlation coefficient was held by the classical version.

of Machine Learning Research 11: 1601-160.



35





Dve nadgradnji algoritma vzvratnega razširjanja

Bojan Ploj



Martin Terbuc

School Centre Ptuj

School Centre Ptuj

Volkmerjeva cesta 19, 2250 Ptuj

Volkmerjeva cesta 19, 2250 Ptuj

Slovenia, Europe

Slovenia, Europe

+386 2 787 1 800

+386 2 787 1 812

bojan.ploj@scptuj.si

martin.terbuc@scptuj.si





ABSTRACT

val napredka je prinesel algoritem vzvratnega razširjanja [3]

In this article two new algorithms for learning Feed-Forward

(angleško backpropagation), ki omogoča učenje večslojnih

Artificial Neural Network are presented. In the introduction, a

nevronskih mrež s povezavami naprej (FFANN) na osnovi

brief description of the development of the existing algorithms

gradientnega zmanjševanja. Najbolj znana med FFANN je

and their flaws are shown. The second part describes the first new

večslojni perceptron (MLP), ki lahko rešuje tudi najbolj zahtevne

algorithm - Bipropagation. Basic idea is given first, followed by a

probleme, če ima za to na voljo dovolj veliko število nevronov in

detailed description of the algorithm. In the third part yet another

slojev [4]. V praksi se MLP izkaže kot razmeroma zmogljiv

new algorithm is given, called Border Pairs Method. Again is first

inteligentni sistem, njegov učni algoritem pa ima žal številne

given a basic idea and then follows a detailed description of the

pomanjkljivosti, zaradi česar se učenje pogosto konča neuspešno.

algorithm. In the fourth part the results and findings of

Kadar so učni podatki nekoliko obsežnejši, se učni čas zelo

experimental work are presented. In the conclusion, it is found

podaljša, hkrati pa se za nameček še zmanjša uspešnost učenja.

that two described algorithms are fast and reliable - the second

Dodatna težava je še dejstvo, da ne poznamo optimalne oblike

one is also constructive.

FFANN (število slojev in nevronov v posameznem sloju), kar je

dokaj pomembno za dober učni rezultat. Premajhna FFANN

Categories and Subject Descriptors

namreč ne zmore iz podatkov posrkati vsega znanja, prevelika pa

• Computing

methodologies~Artificial

intelligence

se nauči prekomerno. In ne samo to, prava sestava FFANN še

• Computing methodologies~Supervised learning • Computing

vedno ni garancija za uspešno učenje. Med učenjem namreč

methodologies~Supervised

learning

by

classification

zmanjšujemo učno napako na osnovi gradienta, zato se lahko

• Computer systems organization~Neural networks • Theory

zgodi, da zaidemo v lokalni minimum in tam obtičimo. Uteži –

of computation~Design and analysis of algorithms • Theory of

prosti parametri FFANN, ki jih med učenjem optimiziramo, so pri

computation~Machine learning theory

algoritmu vzvratnega razširjanja sprva izbrani naključno, kar

pomeni, da je uspeh učenja odvisen tudi od sreče. Vse našteto nas

je privedlo do iskanja alternativnih učnih algoritmov, med

General Terms

katerimi smo odkrili dva zanimiva.

Algorithms, Design, Reliability, Experimentation, Theory,

2. BIPROPAGATION

Keywords

Prvi novi algoritem se imenuje bipropagation [5, 6] in je manjša

Artificial Neural Network, Machine Learning, Supervised

nadgradnja algoritma vzvratnega razširjanja. Osnovna ideja

Learning, Bipropagation, Border Pairs Method, Multi-layer

izboljšave je v tem, da začetne vrednosti uteži in želene vrednosti

Perceptron

notranjih slojev ne prepuščamo več naključju kot pri vzvratnem

razširjanju, ampak jih izračunamo oziroma določimo. Kako to

1. UVOD

naredimo bomo prikazali kasneje na primeru fotografij.

Prvi modeli umetnih nevronskih mrež (ANN) so nastali pred več

Namenoma smo uporabili izraz »notranji sloj« in ne več »skriti

kot sedemdesetimi leti (Warren McCulloch in Walter Pitts) ter od

sloj«, kot je to običaj pri vzvratnem razširjanju. Če namreč

takrat naprej niso doživeli korenitih sprememb. Izdelava novega

poznamo želene vrednosti notranjih slojev, se lahko lotimo učenja

modela je razmeroma preprosta, saj lahko nevrone tako rekoč

vsakega sloja posebej. Tako razbijemo zahteven problem na več

poljubno povezujemo med seboj. Težji del naloge je najti

manjših, ki jih dobro obvladamo, saj posamični sloji nimajo

algoritem, ki bo omogočal uspešno in učinkovito učenje modela.

lokalnih minimumov. Edini še nerešeni problem je torej določanje

Eden prvih ANN modelov je bil Perceptron, njegov učni

smiselnih želenih vrednosti notranjih slojev. Z intuicijo najdemo

algoritem – Hebbovo pravilo [1] so odkrili šele čez nekaj let in je

rešitev ali dve tudi za ta problem. Najlažje pojasnimo to na

omogočal le reševanje preprostih (linearnih) problemov. Modeli

primeru fotografij obrazov, kjer z MLP ugotavljamo spol

in njihovi učni algoritmi so od takrat naprej postoma napredovali

fotografirane osebe. MLP ima toliko vhodov, kolikor je slikovnih

in so zmogli reševanje vse bolj zahtevnih problemov. Ta napredek

točk (pikslov) in en binarni izhod, kjer velja moški = 0, ženska =

se je dogajal v nekakšnih valovih, pri čemer so se izmenjevala

1. Zaradi lažjega razumevanja bodo vsi notranji sloji imeli enako

obdobja počasnega in hitrega razvoja, podobno kot letni časi.

število nevronov, kot je število vhodov oziroma pikslov. Sicer so

Zatišna obdobja so bila tako izrazita, da so dobila celo svoje ime –

ta števila lahko brez zapletov tudi drugačna. Prvi notranji sloj ima

zima umetne inteligence [2]. Razlog za to naj bi bil psihološke

nalogo, da vse piksle osvetli za en odtenek, če je na fotografiji

narave, kajti prebojna odkritja so privedla do vzhičenja

moški, oziroma potemni, če je ženska. Tako dobimo želene

raziskovalcev, ki so jim nato sledila obdobja razočaranja. Drugi

vrednosti prvega notranjega sloja, ki so zelo podobne vhodnim

36

vrednostim. Ostane nam le še izbira uteži prvega sloja. Če bi slika

tudi realna števila (prav tako na intervalu od 1 do 6) in tako

v prvem notranjem sloju ostala enaka vhodni, bi za uteži ustrezala

opazovali tudi navidezni prostor med piksli.

enotska matrika (nevroni imajo odsekoma linearno prenosno

Ugotovitve so bile naslednje:

funkcijo). Ker je slika dejansko zelo podobna vhodni, je enotska

matrika dobro izhodišče za izračun uteži in zato za dobro

(1) Vsi nevroni, navkljub sigmoidni prenosni funkciji, delujejo v

naučenost zadostuje že malo število učnih iteracij. Prav tako je

zasičenju. Razen v zelo ozkem obmejnem pasu imajo vedno

zaradi male spremembe uteži med učenjem prvega sloja mala tudi

izhodno vrednost enako nič ali ena.

verjetnost, da med učenjem naletimo na lokalni minimum. V

(2) Nevroni v prvem sloju lokalno razmejujejo področja bele in

višjih notranjih slojih se zgodba ponavlja, fotografija postane

črne barve.

glede na spol še svetlejša oziroma temnejša in v zadnjem

(3) Nevroni drugega sloja opravljajo logično operacijo IN tistih

notranjem sloju postane čisto svetla ali temna. Izhodni sloj ima le

nevronov iz prvega sloja, ki omejujejo isto črno področje.

en nevron, ki ima trivialno zadolžitev: če dobi na vhod množico

samih ničel, da na izhodu ničlo, če pa dobi same enice, pa da

(4) Nevron tretjega sloja opravlja logično operacijo ALI nevronov

enico. Ko zaključimo učenje posameznih slojev, lahko z metodo

iz drugega sloja.

vzvratnega razširjanja po želji opravimo še fino uglaševanje.

3.2 Sinteza MLP

Posplošitev tega algoritma na razvrščanje med več razredi je

Za fotografijo na sliki 1 torej velja naslednje:

trivialna. Če bi na primer razvrščali desetiške števke, ne bi temnili

cele slike, ampak samo eno desetino. Za enico bi temnili prvo

•

4 nevroni v prvem sloju (za vsako mejno črto a, b , c in

desetino pikslov, za dvojko drugo in tako naprej.

d po eden)

Glede hitrosti učenja je bipropagation v prednosti zaradi dveh

•

2 nevrona za logično operacijo IN v drugem sloju (za

razlogov. (1) Uteži so že pred učenjem podobne pravim. (2) Ker

vsako področje A in B po eden)

se učijo posamezni sloji, ni možno, da bi spremembe uteži v

•

1 izhodni nevron za logično operacijo ALI

predhodnih slojih motile učenje v trenutnem sloju. Kljub vsem

prednostim pa algoritem bipropagation ostaja iterativen, saj še

vedno optimizira uteži z računanjem gradienta v številnih malih

korakih. Demo program za algoritem Bipropagation je objavljen

na spletu [10].

P1

P2

P3

3. METODA MEJNIH PAROV

3.1 Analiza naučenega MLP



Drugi algoritem se imenuje metoda mejnih parov (Border Pairs

Method, BPM) [6, 7]. Njegovo izhodišče je popolnoma drugačno

kot pri algoritmu bipropagation. Najprej smo MLP naučili do te

mere, da je čisto vse učne vzorce razvrstil pravilno, nato pa smo

Slika 2. Iskanje mejnih parov

opazovali lastnosti notranjih slojev. Za učne podatke smo

P2 in P3 sta mejni par, ker je njun presek krožnic prazen.

uporabili črno-belo fotografijo brez sivih odtenkov – slika 1.

Vhodna podatka sta pikslova vrstica Y in stolpec X (dve celi

Zgradba MLP je s tem že določena (glej sliko 3.), nerešen nam

števili od 1 do 6) izhodni podatek pa barva piksla (0 = bela, 1=

ostane le še drugi del naloge – učenje oziroma iskanje optimalnih

črna). Med opazovanjem notranjih slojev smo dajali na vhod MLP

vrednosti uteži nevronov. V drugem in tretjem sloju se opravljata

logični operaciji IN in ALI. Iskanje optimalnih vrednosti uteži

y

c

d

zanju je trivialno. Torej smo z drugim in tretjim slojem že opravili

6

Barva piksla

5

a

4

A

B



3

b

2

A

B

1

1

2

3

4

5

6

x

Slika 1. Razmejitev pikslov v črno-beli fotografiji

a

b

c

d

a, b, c in d so mejne črte; A in B sta področji



X Y

Slika 3. Zgradba optimalnega MLP

Nekatere sinapse v drugem sloju so izločene.

37

v celoti in nam ostane le še učenje nevronov v prvem sloju, ki

določamo po enakem postopku, kot smo ga uporabili pri prvem

določajo položaj posameznih lokalnih mejnih črt.

sloju v dvodimenzionalnem problemu. Pri tem so izhodni podatki

enega sloja vhodni podatki naslednjega. Vsak sloj ima enako ali

3.2.1 Določanje mejnih črt

manjše število nevronov od predhodnega. Ko ostane le še en

Na sliki 1 so narisane štiri mejne črte (črtkane črte a, b, c in d) v

nevron, je postopek končan.

dvodimenzionalnem prostoru (dimenziji X in Y). Te mejne črte so

Razen za učenje MLP so mejni pari primerni tudi za razšumljanje

položene tako, da razdelijo celoten prostor na homogena

podatkov in ugotavljanje zahtevnosti učnih podatkov že pred

področja. Področji A in B vsebujeta le črne piksle, preostala

pričetkom učenja. Razšumljanje z mejnimi pari poteka tako, da

področja pa samo bele. Pri risanju mejnih črt izhajamo iz

udeležence mejnih parov nekoliko premaknemo v takšni smeri, da

predpostavke, da mejne črte potekajo med črnim in belim piksli,

jih približamo najbližjemu podatku istega razreda, ki ni udeležen

ki so si blizu. V ta namen bližnji črno-bel par poimenujemo

v mejnem paru. Ostalih podatkov sploh ni potrebno korigirati, saj

»mejni par«, pri čemer besedo »bližnji« definiramo s presekom

na učenje vplivajo samo mejni pari.

krožnic. Krožnici narišemo tako, da je en piksel v središču kroga,

drugi pa na krožnici. Če v preseku obeh krožnic ni tretje točke,

Zahtevnost učnih podatkov ugotavljam že pred pričetkom učenja

potem rečemo, da sta si točki blizu, oziroma da mejita. Piksla P

iz deleža podatkov, ki so udeleženi v mejnih parih. Kadar so učni

2

in P

podatki nezahtevni, jih je le nekaj % udeleženih v mejnih parih. V

3 na sliki 2. torej tvorita mejni par. Če bi narisali krožnici za

P

nasprotnem primeru sta dve možnosti: velik šum v podatkih ali pa

1 in P3, bi ugotovili, da je P2 v preseku njunih krožnic ter zato P1

in P

podatki skrivajo v sebi zapletena pravila. Lahko pa je tudi

3 nista mejni par. Slika 4. prikazuje vse mejne pare za našo

črno-belo sliko. Skupaj je na sliki 12 mejnih parov, ki jih sedaj

kombinacija obojega. Podatki v obravnavanem primeru so

želimo ločiti s premicami. Gremo po vrsti in najprej ločimo par 1.

ekvidistančni, a eksperimenti so pokazali, da algoritem deluje

enako dobro tudi pri poljubnem razporedu podatkov.

4. EKSPERIMENTALNO DELO

Prava vrednost vsakega izdelka se seveda pokaže med njegovo

uporabo, zato smo oba algoritma testirali z različnimi učnimi

podatki. Zaradi lažje primerljivosti z drugimi algoritmi in z

1

2

3

4

9

11

drugimi inteligentnimi sistemi smo uporabili uveljavljene učne

podatke. Začeli smo z logično funkcijo ekskluzivni ALI (XOR),

ki je ena od najmanjših učnih množic, a je vseeno zanimiva zaradi

10

12

izrazitega lokalnega minimuma. Oba algoritma sta se izkazala

5

6

7

8

izvrstno. Vzvratno razširjanje je potrebovalo za isto nalogo

petindvajsetkrat več iteracij kot algoritem bipropagation. Rezultati

temeljijo na povprečju množice poskusov. BPM je brez težav

našla dve mejni črti in izvedla eno logično operacijo IN v enem

koraku. Ponavljanje poskusa ni potrebno, ker tu ni naključnih

začetnih vrednosti, ki bi vnesle dejavnik sreče v poskus.

Slika 4. Mejni pari

Naslednja učna množica so bile Perunike (Iris) [8]. To so štiri

dimenzionalni podatki, ki vsebujejo tri razrede in 150 učnih

Dvoglave puščice ponazarjajo mejne pare, ki jih

vzorcev. BPM je že pred pričetkom učenja izločila 146 od 150

sečejo mejne premice

učnih vzorcev in nalogo rešila z enim samim nevronom.

Iz slike je razvidno, da lahko ista premica loči tudi pare 2, 3 in 4.

Najzahtevnejši in najobsežnejši test je bil izveden z zbirko

Tako ubijemo 4 muhe na en mah in dobimo prvo vodoravno

podatkov MNIST, ki obsega slike 70 tisoč rokopisnih števk v

premico. Druga vodoravna premica loči pare 5, 6, 7 in 8. Z dvema

ločljivosti 28*28 slikovnih točk (784 dimenzionalni problem), v 8

navpičnima premicama ločimo še ostale štiri pare in ugotovimo,

bitni barvni ločljivosti (256 odtenkov sive barve) [9]. Metoda

da so skupno potrebne štiri premice. Med postavljanjem premice

vzvratnega razširjanja tukaj popolnoma odpove, bipropagation pa

učimo en nevron z udeleženci mejnih parov. Tako dobimo 4

najde rešitev v zgolj nekaj 10 iteracijah. BPM je bila preizkušena

nevrone v prvem sloju in s tem je znana (skoraj) minimalna

le z delnim naborom podatkov MNIST zaradi omejitev strojne

zgradba celotnega MLP. Za računanje razdalje med točkami

opreme. Rezultat dosežen z delnim naborom MNIST je bil

(polmer krogov) in za ugotavljanje ali par meji, uporabljamo zgolj

podoben rezultatom, ki jih drugi algoritmi dosegajo s celotnim

evklidsko razdaljo, ki je določljiva v prostoru s poljubnim

naborom učnih vzorcev. Pri nobenem od obeh obravnavanih

številom dimenzij. Tako lahko rešujemo probleme v poljubno

algoritmov nismo zaznali nobenih omejitev glede uporabnosti.

dimenzionalnem prostoru. Krožnice v tridimenzionalnem prostoru

Oba sta primerna za razvrščanje poljubnih podatkov.

nadomestijo sfere, v še več dimenzionalnem prostoru pa

hipersfere. V redkih primerih se zgodi, da že ločimo vse mejne

5. ZAKLJUČEK

pare, a ostane kakšno področje še vedno nehomogeno. V tem

Uvodoma smo se seznanili s težavami, ki jih srečamo med

primeru se lotimo iskanja dodatnih mejnih parov, a tokrat le še v

učenjem nevronskih mrež tipa FFANN oziroma MLP: počasnost,

nehomogenem področju. Tako smo med učenjem uporabljali le 20

nezanesljivost, nenatančnost in nekonstruktivnost. V ta namen sta

od 36 pikslov. Učili smo le 4 nevrone od skupno 7 in še te 4

nastala dva nova algoritma: »bipropagation« in »metoda mejnih

vsakega posamično. Posebej pomembno pri tem pa je, da ta

parov« (BPM). Izkazalo se je, da oba nova algoritma odpravljata

algoritem v vsakem primeru najde rešitev. V primeru, ko so učni

mnoge pomanjkljivosti vzvratnega razširjanja in ne prinašata

podatki več kot dvodimenzionalni, si vhodnega prostora ne

nobenih novih. Tukaj izpostavljamo konstruktivnost in ne

moremo narisati in se zato postopek malo spremeni. Vse sloje

iterativnost algoritma BPM. Algoritem BPM je bil do sedaj

38

objavljen le v različici za razvrščanje, v razvoju pa je že tudi

International Electrotechical and Computer Science

različica za regresijo. Zelo koristna za uveljavljanje obeh novih

Conference ERK 2009, Slovenian section IEEE, pp 199-202,

algoritmov bi bila izdelava programske opreme zanju. Bodisi

2009.

njuna vključitev v uveljavljena programska orodja na področju

[6] Bojan Ploj: Advances in Machine Learning Research, 3rd

strojnega učenja, ali pa izdelava njunih knjižnic v uveljavljenih

chapter, Nova Science Publishers, 2014, New York.

programskih jezikih.

[7] Bojan Ploj, Milan Zorman, Peter Kokol: Border Pairs

6. REFERENCE

Method – Constructive MLP Learning Classification

Algorithm, Adaptive and Intelligent Systems,Second

[1] Hebb, Donald Olding, The Organization of Behaviour: A

International Conference, ICAIS 2011, Klagenfurt, Austria,

Neuropsychological Theory, 1949.

September 6-8, 2011. Proceedings

[2] AI winter, Wikipedija,

[8] Iris data set,

https://www.wikiwand.com/en/AI_winter, 15. september

http://en.wikipedia.org/wiki/Iris_flower_data_set, 12. 1.

2016.

2013

[3] Davida E. Rumelharta, Geoffreya E. Hintona in Ronalda J.

[9] Yann LeCun, Corinna Cortes: yan.lecun.com/exdb/mnist,

Williamsa, Learning representations by back-propagating

MNIST , Handwritten digit database

errors, Nature, October 1986.

[10] Bojan Ploj: Bipropagation Demo App,

[4] Irie, Miyake; Capabilities of three-layered perceptrons, IEEE

https://www.mathworks.com/matlabcentral/fileexchange/553

International Conference on Neural Networks, 1988.

89-bipropagation, 10. 9. 2016.

[5] B. Ploj, Bipropagation - nov način učenja večslojnega

perceptrona (MLP), Proceedings of the Eighteenth



39





Ebook Recommendations Based on Stylometric Features

Lovro Žitnik and Marko Robnik-Šikonja

University of Ljubljana, Faculty of Computer and Information Science

Večna pot 113, 1000 Ljubljana, Slovenia

lovro@zitnik.com, marko.robnik@fri.uni-lj.si

ABSTRACT

rational content-based filtering (CB) recommender system.

We present a recommender system for Slovene ebooks using

This approach could be used to solve the cold-start problem,

stylometric features and machine learning. Recommender

where recommendations are required for a new user, and

systems have become a standard component of modern web

other situations with inadequate data for a collaborative fil-

platforms and a popular way for users to find novel items

tering (CF). Using the CB we could also recommend books

or services.

With an increasing number of available Slo-

similar to only one designated book, which greatly reduces

vene ebooks and growing popularity among users, we were

required input of new users.

motivated to build a tool that helps them find their next

book. As a data set we use books in the ePub format, ano-

For the CB we use only the content of ebooks to analyze the

nymized list of users from an e-library system and a set of

writing style of the authors, and do not use other metadata

transactions between books and users. The textual content

(e.g.

information about authors, literary style, keywords

of books is tokenized, POS-tagged and lemmatized. Nume-

etc). In pre-processing we use natural language processing

rical stylometric features of books are extracted and evalu-

techniques and apply statistical methods to obtain 92 nu-

ated by unsupervised feature ranking methods SPEC and

merical stylometric features for each book. The examples of

Laplacian score. The reduced feature set describing books

raw features include average number of syllables in a sen-

and users is used in clustering and classification tasks, inclu-

tence, the percentage of words in direct speech, the percen-

ding one-class. Several variants of recommender algorithms

tage of common and proper nouns, average concreteness of

are presented, based on both content and collaborative fil-

the words etc. Additionally, we compute several well-known

tering. The automatic and user evaluations show that the

readability metrics. We evaluate the importance of features

content-based approach with stylometric features is feasi-

using unsupervised measures Laplacian score and SPEC.

ble, however, the collaborative filtering gives better overall

performance.

We perform a dimensionality reduction and clustering on the

stylometric features of books to reduce noise and produce an

Categories and Subject Descriptors

input data set for five CB algorithms. The list of users and

H.4 [Information Systems Applications]: Miscellane-

transactions are processed in the same way to produce two

ous; D.2.8 [Software Engineering]: Metrics—complexity

data sets as input for two CF algorithms. The first data set

measures, performance measures

uses only profile data, whereas book rankings are added to

the second.

Keywords

In related work Fleischman et al [5] use natural language

stylometric features, ebooks, recommender systems, machine

processing (NLP) techniques for recommendations, but do

learning, Slovene language, clustering, classification

not use stylometry. Vaz et al [12] use writing style and nega-

tive examples to prototype a working recommender system.

1.

INTRODUCTION

Recommender systems (RS) suggest an ordered list of items

The paper is based on [16] and consists of six Sections. In

to users by predicting their relevance. In our case, an algo-

Section 2 we present stylometric features. Section 3 descri-

rithm suggests a personalized list of recommended ebooks.

bes data sets used to train the algorithms described in Sec-

The motivation behind building such a tool were an increa-

tion 4. Section 5 contains the evaluation and comparison

sed number of Slovene ebooks, growing number of users that

of developed approaches. In Section 6 we summarize the

demand easier decision-making and the fact that there was

conclusions and present guidelines for further work.

no available RS for Slovene ebooks.

Stylometry is defined as a statistical analysis of variations

2.

STYLOMETRIC FEATURES

in a literary style for writers or genres. When the preferred

Stylometric features (SF) are numerical descriptions of a

literary style of a user can be determined, books of the same

book’s writing style. They are derived from the content of

style could make usable recommendations.

a book. In our approach we tokenize books into chapters,

paragraphs and sentences. Stopping at the sentence level is

The aim of our work is to test the hypothesis that stylome-

necessary to preserve the context of each separate word for

tric features, derived only from the text, suffice for an ope-

a more precise classification. We perform the lemmatization

40

and POS-tagging, where each word or punctuation is clas-

Users

Transactions

Books

sified as an adjective, verb, adverb, comma etc. Publicly

available Slovene document corpus ccGigafida [4] is used for

CSV

JSON

Raw data

learning. We tested several taggers for speed and accuracy

10813 users

2151834 transactions

1160 ePub 323 PDF

and choose Brill tagger [1].

1147 useful

+ # - *

-

-

*

*

#

From the pre-processed data we derive different SF. Sim-

+ ?

?

+

i

ple features include percentage of sentences with different

Imputation

Interpretation Metadata Raw text

number of commas, number of punctuation marks, average

of transactions

number of syllables per word or sentence etc. More advan-

Tokenisation

+

-

+

-

ced features include overall percentage of the quoted words

+

-

One-hot

or number of sentences with direct speech. A special care

Positive Negative

Sentences and words

Feature extraction

Encoding

ratings

ratings

was taken to preserve pronouns during the analysis. They

Processing

POS-tagging

are usually removed from the analysis via stop-words lists,

Lemmatization

because they are not relevant for analysis based on content.

Statistical methods

Pennebaker [10] advocates that pronouns play a vital role

in determining writing style. Our analysis confirms this, as

< 1, 2, 3 .. 28 > < 1, 2, 3 .. 1175 > < 1, 2, 3 .. 92 >

the feature ”Percentage of pronouns”is ranked third by the

User feature vector nr.1 User feature vector nr.2

Book feature vector

(User profile)

(User book profile)

Laplacian score [6] and fifth by the SPEC algorithm [14] in

a list of total 92 stylometric features. The scores of all SF

are available in [16].

Saving to database

Figure 1. The diagram of feature extraction process.

We compute several readability metrics derived for English

language as described by Mailloux et al [8].

While their

1 to 5). In our case the ratings are implicit and have to

transfer to Slovene language might be questionable, they

be derived from users’ interaction with the e-library web

were used previously [15] and our tests in Table 1 show

application. The transaction ”Confirm rental” was used as

that they are strongly correlated to the text comprehen-

the best approximation for positive rating of the book. The

sibility. The average word concreteness and average age of

best candidate for negative rating was the transaction ”View

word acquisition are also used features, based on [7] and [2].

of the book page” occurring only once, without any further

interactions with the same book.

Table 1. Average readability metrics for different book ca-

tegories. Columns report the Gunning fog index (GFI), Fle-

The ratings for all books were added to the database for all

sch reading-ease score (FRES) and Flesh-Kincaid grade level

the users for the purpose of CF. The attributes for all the

(FKGL).

books contain the value 1 for the positive rating, value -1 for

the negative rating, and value 0 if no rating was extracted

category

sample size

GFI

FRES

FKGL

from transactions. This data set is called ”User book profile”

E-books for Kids

81

7,26

39,76

10,47

(UBP).

E-books for Teens

25

8,41

33,63

11,68

Romance

69

7,91

35,68

11,15

Poetry and Drama

161

8,22

34,21

11,80

For each of the three data sets (books stylometric features,

Natural Science, Technology, Math

5

12,04

10,88

15,93

user profile and user book profile) we computed several de-

Humanities and Social Sciences

96

13,17

10,13

16,63

rivative data sets with dimensionality reduction techniques,

such as PCA and t-SNE. The DBSCAN and k-means algori-

3.

DATABASE

thms were used for clustering, where the number of centers

To form stylometric data sets we used 1047 digitalized ebo-

k was determined by optimization of Silhouette score. The

oks in ePub format. We obtained anonymized user profiles

visualization of one such end result is available in Fig. 2.

of 10,813 users of a Slovene e-library system and 2,151,834

transactions between users and books. We extracted and

4.

RECOMMENDER SYSTEM

consolidated the data into a unified MongoDB database.

The process is depicted in Fig. 1 and results in three data

Using the data described in the previous Section seven diffe-

sets.

rent recommendation algorithms were developed: five content-

based filtering (CB) algorithms and two collaborative filte-

As a result of the feature extraction process, we obtain the

ring (CF) algorithms.

data set with 92 numerical stylometric features for each

book.

We assume that the ratings extracted from the implicit user

transactions are of two significantly different quality levels,

The ”user profile” (UP) consists of a gender, year of birth,

i.e., positive ratings as more reliable than negative ratings.

education and current status of each user. To impute mis-

We describe the tested algorithms using different approaches

sing values for some users, we use model-based imputation

and different subsets of data below.

based on naive Bayes classifier.

4.1

Content-based filtering

The biggest challenge was to extract positive and negative

The average book algorithm (CNTR) calculates the the

book ratings from the transactions.

In recommender sy-

average book (i.e. the centroid) from the positively rated bo-

stems usually explicit user ratings of items are available,

oks of a user and uses nearest neighbor techniques proposed

where each user rates some items with a score (e.g., from

by Sarwar et al [11] to find recommended books.

41





5.

EVALUATION

We used two types of evaluation: offline testing and user

preference testing.

With offline testing, we evaluated the prediction accuracy

of algorithms by using the existing data.

20% of positi-

vely rated books were hidden and the other 80% is used

for learning (i.e. input to recommendation algorithms). By

measuring how many books from the hidden set the recom-

mendation algorithms managed to retrieve we evaluate their

precision and recall.

The distribution of positively rated books has a strong long-

tail characteristics (see Fig. 3). To avoid imbalanced distri-

bution, we divided users into 10 groups (G1 - G10), depen-

ding on a number of positively rated books (2, 3, 4, 5, 7, 10,

18, 28, 50, or 100) and tested each group separately.

Number of users with same number of positively rated books

10000

Figure 2. Clustering of books with k-means (k=12, Silhou-

ette optimized) in 10-dimensional space, additionally redu-

ced to 2-dimensional space with t-SNE.

1000

The nearest books algorithm (NEAR) loops through

100

the list of positively rated books, adding the next nearest

book to the list of recommended books.

Number of users

10

The binary classification algorithm (BIN) is the only

CF algorithm that uses the negative ratings. The positive

1

and negative ratings of books for a user are used as a train-

0

5

10

15

20

25

30

35

40

46

51

58

64

69

79

91

104

126

141

158

ing data for probabilistic binary classifier, which is used for

Number of positively rated books

classification of new books. The returned probability scores

Figure 3. Number of users (logarithmic scale) in sets with

are used to sort the list.

same number of positively rated books.

The one-class SVM algorithm (SVM) is a modified ver-

As a baseline, we included a random book recommendation

sion of Support Vector Machine, using learning from only

(RAND). All seven algorithms were tested with several di-

one class. We discard the negatively rated books and only

fferent settings (different type and level of dimensionality

learn from positively rated books. We use the resulting clas-

reduction, metrics, classifiers etc.). The precision scores for

sifier on a new unrated set of books. This approach is descri-

the best settings of each algorithm are presented in Fig. 4

bed by Manevitz et al [9] and used by Yu et al [13], where

and 5. For details, see [16].

it is reported to have a poor performance.

AVERAGE PRECISION BY GROUP

The PU algorithm (PU) uses two sets of books from a

RAND

CNTR

NEAR

BIN

SVM

PU

CFUPO

CFUBP

particular user, the positively rated books and the unrated

0.12

books. From the set of unrated books it tries to single out

0.1

the books into the negatively rated books set. In this way

0.08

the problem is transformed into a binary classification pro-

blem. we use the algorithm proposed by Elkan et al [3].

0.06

0.04

4.2

Collaborative filtering

0.02

CF algorithms do not use stylometric features. Instead they

0

rely on the history of users’ transactions. We form two clas-

G1

G2

G3

G4

G5

G6

G7

G8

G9 G10

sifiers.

Figure 4. Average precision by group.

OVERALL AVERAGE AND MEDIAN OF PRECISION

The user profile only algorithm (CFUPO) first clusters

average median

users based only on their user profile, then it finds the ne-

0.025

arest user in the same cluster and recommends the books

0.02

of that user, removing the books already rated positively to

0.015

avoid duplication.

0.01

0.005

The user book profile algorithm (CFUBP) differs from

0

the CFUPO only by using the user book profile.

RAND CNTR NEAR BIN

SVM

PU CFUPO CFUBP

Figure 5. Overall average and median of precision.

42

The results show that all the algorithms outperformed the

Natural Language Processing, ANLC’92, pages

baseline random algorithm. CFUPB, which uses user book

152–155, 1992.

profile, scores best, followed surprisingly by SVM, which

[2] M. Brysbaert, A. B. Warriner, and V. Kuperman.

contradicts low expectations expressed in [13]. We assume

Concreteness ratings for 40 thousand generally known

that the worst score for BIN is caused by lack of data for tra-

English word lemmas. Behavior research methods,

ining, whereas other CB algorithms score surprisingly well

46(3):904–911, 2014.

compared to CF.

[3] C. Elkan and K. Noto. Learning classifiers from only

positive and unlabeled data. In Proceedings of the 14th

In evaluation of user preferences we assigned 9 users a

ACM SIGKDD International Conference on

task to select 10 books they like and 5 they do not like.

Knowledge Discovery and Data Mining, KDD ’08,

The age, birth year, education and current status of the

pages 213–220, 2008.

tested persons were also collected. WE supplied this input

[4] T. Erjavec and N. Logar Berginc. Referenčni korpusi

to all 8 recommendation algorithms and assembled a list of

slovenskega jezika (cc)Gigafida in (cc)KRES. In

recommended books for each user. A visual user interface

T. Erjavec and J. Žganec Gros, editors, Zbornik Osme

to rank algorithms from 1 to 8 was provided. The first three

konference Jezikovne tehnologije, pages 57–62,

recommendations were marked with gold, silver and bronze

Ljubljana, 2012.

medals. The average rank and its standard deviation are

[5] M. Fleischman and E. Hovy. Recommendations

presented in Fig. 6.

without user preferences: A natural language

processing approach. In Proceedings of the 8th

AVERAGE RANKING AND STANDARD DEVIATION OF RANKING

International Conference on Intelligent User

average ranking

stand. deviation of ranking

Interfaces, IUI ’03, pages 242–244, 2003.

7

6

[6] X. He, D. Cai, and P. Niyogi. Laplacian score for

5

feature selection. In Y. Weiss, B. Schölkopf, and J. C.

4

Platt, editors, Advances in Neural Information

3

Processing Systems 18, pages 507–514. 2006.

2

[7] V. Kuperman, H. Stadthagen-Gonzalez, and

1

0

M. Brysbaert. Age-of-acquisition ratings for 30,000

CFUBP CFUPO NEAR

PU

CNTR

BIN

RAND SVM

English words. Behavior Research Methods,

Figure 6. The algorithm ranking by users.

44(4):978–990, 2012.

[8] S. L. Mailloux, M. E. Johnson, D. G. Fisher, and T. J.

The results show that users prefer CF recommendations,

Pettibone. How reliable is computerized assessment of

which took first two places, before all CB algorithms. Sur-

readability? Computers in nursing, 13:221–221, 1995.

prisingly, the random algorithm ranked second to last. This

[9] L. M. Manevitz and M. Yousef. One-class SVMs for

could be attributed either to the users desire for novel books

document classification. J. Mach. Learn. Res.,

or to a noise, since users reported that ranking lower ranked

2:139–154, Mar. 2002.

suggestion was difficult.

[10] J. W. Pennebaker. The Secret Life of Pronouns: What

Our Words Say About Us. Bloomsbury USA, 2011.

6.

CONCLUSIONS

[11] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl.

We propose the content-filtering recommender system for

Application of dimensionality reduction in

books using only stylometric features.

We propose seve-

recommender system-a case study. Technical report,

ral content-based filtering algorithms using only numerical

DTIC Document, 2000.

stylometric features extracted from book content. We ben-

[12] P. C. Vaz, D. Martins de Matos, and B. Martins.

chmark them against two collaborative filtering algorithms

Stylometric relevance-feedback towards a hybrid book

and random recommendations.

recommendation algorithm. In Proceedings of the Fifth

ACM Workshop on Research Advances in Large

The results of both automatic evaluation and user preferen-

Digital Book Repositories and Complementary Media,

ces indicate that recommender systems using only stylome-

BooksOnline ’12, pages 13–16, 2012.

tric features are feasible, but they are inferior in performance

[13] H. Yu, W. Zuo, and T. Peng. A new pu learning

compared to collaborative filtering recommendations.

algorithm for text classification. In Mexican

International Conference on Artificial Intelligence,

We assume that a significant improvement of content-filtering

pages 824–832, 2005.

algorithms could be obtained by gathering explicit ratings of

[14] Z. Zhao and H. Liu. Spectral feature selection for

books. Incorporating the word-space similarity as described

supervised and unsupervised learning. In Proceedings

in [5] could also prove beneficial. For a real-world book re-

of the 24th International Conference on Machine

commender system we recommend development of a hybrid

Learning, ICML ’07, pages 1151–1157, 2007.

algorithm that combines both content-based and collabora-

[15] A. Zwitter Vitez. Ugotavljanje avtorstva besedil:

tive filtering approaches.

This could solve the cold start

primer ”trenirkarjev”. In T. Erjavec and

problem.

J. Žganec Gros, editors, Zbornik Osme konference

Jezikovne tehnologije. Ljubljana, 2014.

7.

REFERENCES

[16] L. Žitnik. Recommender system for slovene electronic

[1] E. Brill. A simple rule-based part of speech tagger. In

books. University of Ljubljana, Faculty of Computer

Proceedings of the Third Conference on Applied

and Information Science, 2016. Diploma thesis.

43





Model selection on the JSI grid: Metis use-case

Jernej Zupančič

Damjan Kužnar

Matjaž Gams

“Jožef Stefan” Institut

“Jožef Stefan” Institut

“Jožef Stefan” Institut

Jamova cesta 39

Jamova cesta 39

Jamova cesta 39

Ljubljana, Slovenia

Ljubljana, Slovenia

Ljubljana, Slovenia

jernej.zupancic@ijs.com

damjan.kuznar@ijs.com

matjaz.gams@ijs.com

ABSTRACT

3. Gradients are usually not available.

The paper presents model selection using a random search

method on the Jožef Stefan Institute (JSI) grid infrastruc-

ture.

Best practices for model selection on the grid are

Additionally, in real life scenarios one does not have the

presented through the use-case, where the dataset from the

access to the ground truth distribution Gx, therefore, the

Metis project was used. The model selection task was to find

value that we wish to optimize, i.e. expectation over the

the best learning algorithm and its parameters according to

unknown distribution, is impossible to evaluate. In order to

the area under the curve performance metric. Although the

mitigate the problem of Gxs unavailability, cross-validation

gradient boosting algorithm performed best in the experi-

is often used. In cross-validation the expectation is replaced

ment, logistic regression was chosen for the implementation

with a mean loss over validation set Dvalid. Cross-validation

in the Metis prediction module due to the high model trans-

is unbiased as long as Dvalid is independent of the data used

parency and good performance.

by Aλ [4]. Therefore, the problem 1 is approximated by the problem

1.

INTRODUCTION

A∗, λ∗A ≈ argmin

mean

L (x; A

A∈A,λ∈Λ

x∈Dvalid

λ(D)).

(2)

The objective of a learning task T = (L, D) is to find an algo-

rithm A and the algorithm parameters λ so that the function

Since the problem 2 typically involves an evaluation of sev-

f = Aλ (D), which is the result of the use of training algo-

eral (A, λ) pairs, which in turn usually require an inner opti-

rithm on a dataset, minimizes some expected loss L(x; f )

mization run, the problem is computationally intensive. The

over i.i.d. samples x from a ground truth distribution Gx.

use of a cluster of computers is therefore required in order to

Algorithm A is a functional that maps a dataset D, which

obtain satisfiable solutions in a reasonable time frame. JSI

is a finite set of samples from Gx, to f . Algorithm A usually

grid1 was used in the presented experiments.

performs an optimization of some loss metric with respect to

parameters θ that correspond to the internal representation

The rest of the paper is structured as follows. In section 2

of f by A (e.g. weight values in a neural network). Addi-

related work in model selection and hyper-parameter tuning

tionally, the algorithm A often has adjustable parameters

is presented, in section 3 the experimental setup is described,

λ called hyper-parameters (e.g. number and sizes of hidden

in section 4 the results from the experiment are presented,

layer in a neural network). Those hyper-parameters often

and section 5 concludes the paper.

determine the performance of the algorithm and their setting

is crucial for obtaining an algorithm that effectively learns

a function f on the training dataset D and performs well

2.

RELATED WORK

on new and previously unseen samples from G

Model selection and hyper-parameter tuning problems are

x. Therefore,

an algorithm and its settings that minimize generalization

usually addressed by manual search and grid search or a

error

combination of the two (e.g. [10, 6]). Although several model

and hyper-parameter optimization methods have been intro-

Ex∼G [L(x; A

x

λ(D))]

duced in the recent years such as sequential model based op-

are required. The problem of finding the best model and its

timization [7, 3] or efficient global optimization algorithm [8,

parameters with respect to the generalization error is called

1] researchers and practitioners still prefer a combination of

model selection, while the problem of finding only the best

manual and grid search. This combination does have some

parameters for a chosen algorithm is called hyper-parameter

nice properties:

tuning. This paper presents methods (and an application of

one such method) to address the model selection problem:

1. It enables the practitioner to build the intuition about

A∗, λ∗

the performance of the models and their parameters.

A = argminA∈

E

[L(x; A

A,λ∈Λ

x∼Gx

λ(D))].

(1)

The optimization problem 1 is hard due to several reasons:

2. It enables the practitioner to focus on models and set-

tings that are most promising.

1. The problem is usually non-convex.

3. It is effective in lower dimensions.

2. Search space is complicated and hierarchical.

1https://www.ijs.si/ijsw/nsc/Uporaba%20gru%C4%8De%20IJS

44





4. Grid search is simple to implement and embarrassingly

– max depth ∈ {5, 8, 15, 25, 30, N one}

parallelizable.

– min samples split ∈ {1, 2, 4, 10, 15, 100}

– min samples leaf ∈ {1, 2, 5, 10}

However, it has the disadvantage of not being reproducible

– max features ∈ {0log20,0 sqrt0, N one}

and not being effective in the cases of large search spaces

and higher dimensions.

• KNeighborsClassifier:

State-of-the-art approaches for model selection and hyper-

– n neighbors ∈ {0l01,0 l02}

parameter tuning are based on sequential model based opti-

– p ∈ {0.001, 0.01, 0.1, 1, 10, 50, 100}

mization approaches and have already been implemented in

•

several software libraries such as AutoWEKA2, AutoSklearn3,

SVC:

Spearmint4, Hyperopt5, etc. The problem with the sequen-

– C ∈ {0.001, 0.01, 0.1, 1, 10, 100, 1000}

tial model based optimization methods is that they are not

–

simple to parallelize and are difficult to implement. In the

gamma ∈ {0auto0, 0.01, 0.05, 0.1, 0.2}

parallel implementation the authors usually choose a trade-

– class weight ∈ {0balanced0, N one}

off between effective modelling of the problem and the time

• XGBClassifier:

speed-up enabled by the use of the cluster [3].

– learning rate ∈ {0.01, 0.025, 0.05, 0.1}

Although not state-of-the-art any more the random search

–

is another method for model selection and parameter tuning

n estimators ∈ {120, 300, 500, 800, 1200}

which performs very well on high dimensional problems [2].

– gamma ∈ {0.05, 0.3, 0.7, 1.}

Additionally, random search is simple to implement and par-

– max depth ∈ {3, 6, 12}

allelize, while being much more efficient than a grid search.

– min child weight ∈ {1, 3, 7}

3.

EXPERIMENTAL SETUP

– subsample ∈ {0.6, 0.8, 1.}

Random search method for model selection was implemented

– colsample bytree ∈ {0.6, 0.8, 1.}

for a use in a multi-node setting. Each worker ran indepen-

– reg lambda ∈ {0.01, 0.025, 0.05, 0.075, 0.1, 1.}

dently and it followed the following steps:

– reg alpha ∈ {0, 0.1, 0.25, 0.5, 0.75, 1.}

1. Set-up an environment on a computational node.

The search space includes 235 538 possible combinations.

2. Sample from available algorithms and corresponding

Taking into account the time required to build the model

parameters.

in the case of random forest or tree boosting (about 1 hour

as observed from the log files) a full grid search was not

3. Evaluate the performance of a chosen model.

feasible.

4. Report the performance of the model to the central

node.

3.1

The data

The data about student performance analyzed in the Metis

5. Repeat from step 2.

project was used to provide a use-case. The data includes

the same features as presented in [9], however, the informa-

tion from more students was added. In the use-case the in-

Learning algorithms available in established Python 3.5 li-

formation of 70 155 unique students at four subjects (Math-

braries Scikit-learn [11] (logistic regression, random forest,

ematics, Slovene, English and Chemistry) was used, which

naive bayes, nearest neighbours, support vector classifier)

translates to 1 736 979 training instances.

Each instance

and XGBoost [5] (tree boosting) were selected as possible

represents a state of the pupil at the end of a month (e.g.

algorithms. MongoDB6 was used to store the results. The

October) at one subject (e.g. Mathematics). This should

following parameter options (according to the algorithm im-

result in 10 instances for one pupil for one subject per a

plementation) were available:

school year, however, since the data for December was not

available only 9 instances per school year were available. A

• LogisticRegression:

total of 32 features was constructed taking into account the

grades in current and previous grading periods and simple

– penalty ∈ {0l01,0 l02}

statistics derived from the grades (min, max, mean, devi-

– C ∈ {0.001, 0.01, 0.1, 1, 10, 50, 100}

ation, difference), the number of excused and not excused

absences, time remaining to the end of the grading period,

• RandomForestClassifier:

and the number of times a pupil was missing from the ex-

amination. The goal was to predict whether a pupil will

– n estimators ∈ {120, 300, 500, 800, 1200}

have a negative grade at the end of the grading period. The

2http://www.cs.ubc.ca/labs/beta/Projects/autoweka/

dataset included 114 414 of positive instances (i.e. student

3http://automl.github.io/auto-sklearn/stable/

had a negative grade at the end of the grading period) and

4https://github.com/HIPS/Spearmint

1 622 565 of negative instances (i.e. student had a positive

5http://hyperopt.github.io/hyperopt/

grade at the end of the grading period). Due to sensitive

6https://www.mongodb.com/

personal information the data is not publicly available.

45





3.2

Model evaluation

GaussianNB

We have used an area under the curve (AUC), standard met-

ric for the evaluation of binary classification models, with

LogisticRegression

few adjustments. To evaluate a model, instances were first

divided by month. The 5-fold cross validation was used on

RandomForestClassifier

the instances of the same month to get a cross validation

probability prediction for every instance.

Using the pre-

model

XGBClassifier

dicted probability for every instance AUC metric was cal-

culated for every month, which was the result of the model

KNeighborsClassifier

evaluation.

SVC

3.3

The grid

0.80 0.82 0.84 0.86 0.88 0.90 0.92 0.94 0.96

JSI computing cluster, which is a part of SLING - Slovenian

mean_scores

initiative for national grid, is accessed through the member-

ship at nsc.ijs.si virtual organization. The cluster enables

Figure 1: A box-plot of scores grouped by a model.

the submission of batch jobs using the Advanced Resource

Connector (ARC)7 middleware and has 1 984 CPUs avail-

able for computational tasks. Every researcher also has the

right to be a member of the gen.vo.sling.si virtual orga-

4.2

The grid experience

nization used for submitting jobs to common Arnes cluster,

Although the availability of the grid is a big bonus for the

however, this cluster is usually occupied and the times for

JSI researchers, it still has some issues that have to be taken

the jobs to start and eventually finish are much longer than

into account when writing software that is to be run on the

using the JSI cluster.

grid.

The connection between nodes in a grid is not avail-

4.

RESULTS

able (at least through Python and SSH). This makes

the deployment of software designed for multiple nodes not

4.1

Metis use-case results

possible.

The experiment was run on 20 computational nodes in the

grid for 24 hours. We have obtained 112 model evaluations.

When an error occurs in a job the whole job is can-

The reason for such a low number of successful model evalu-

celled. Even when multiple processes are run on one node,

ations was the failure of computing nodes on the grid, which

it the program in the process results in an error, the whole

will be explained in more details in section 4.2. As described

job is terminated.

in 3.2 evaluation values were obtained for every month dur-

ing the school year. In order to compare between them a

Only batch jobs are possible. While some problems are

mean of those values was calculated for each model eval-

appropriate for a batch job grid usage type, others are not.

uation score.

It was observed (Figure 1) that XGBClas-

Especially the data mining problems, where the practitioner

sifier was performing best no matter the parameter values

tries several approaches and requires instant feedback in or-

used (mean score of 0.939 AUC), followed by RandomForest-

der to proceed with the next step.

Classifier (mean score of 0.937 AUC) and LogisticRegression

(mean score of 0.935 AUC). All results are presented in Fig-

The following best practices are proposed when working with

ure 2. GaussianNB model was evaluated only once, since it

the JSI grid:

does not require the setting of parameters, while SVC model

was also evaluated only once, however, that was due to the

Parallelize your code so that each worker can work

fact that all other attempts to evaluate the SVC model on

on independent task. It would be best if no communi-

the given data resulted in error.

cation between workers is required, the only communication

would be the reporting of results to your local database,

Although the XGBClassifier was performing best, we have

which is accessible to the grid inside the JSI. This will en-

chosen LogisticRegression for the use in the Metis prediction

able you to launch as many workers as you wish.

module, since it is a transparent model and it performs sim-

ilar to the best performing XGBClassifier (when parameters

Use only one CPU per worker/job. In the case of error

are set to the values found in the search that perform best

you will loose only one computation node.

in case of LogisticRegression - penalty =0 l01, C = 1).

Use local testing environment for testing the scripts

We have also ranked the models according to the Pareto

you are about to submit as a job to the grid. Best,

front dominance (Figure 3). Although the first front does

use container technologies (e.g. Docker8) to start a fresh

present some valuable insight (only XGBClassifier and Ran-

Linux environment (as used by the grid), transfer all the

domForestClassifier appear on the first front and are there-

required scripts and data to the environment and perform a

fore non-dominated) the information from other second and

test (e.g. in a case of data mining task a test could be done

other fronts does not provide additional information, since

with smaller amount of data).

the fronts strive for diversity.

7http://www.nordugrid.org/arc/about-arc.html

8https://www.docker.com/

46





1.0

0.8

0.6

GaussianNB

KNeighborsClassifier

0.4

LogisticRegression

mean_scores

RandomForestClassifier

0.2

SVC

XGBClassifier

0.0 0

20

40

60

80

100

sorted_index

Figure 2: Mean scores over all months for all evaluations.

18

6.

REFERENCES

model

16

[1] T. Bartz-Beielstein and S. Markon. Tuning search

GaussianNB

algorithms for real-world applications: A regression

14

KNeighborsClassifier

LogisticRegression

tree based approach. In Evolutionary Computation,

12

RandomForestClassifier

SVC

2004. CEC2004. Congress on, volume 1, pages

10

XGBClassifier

1111–1118. IEEE, 2004.

8

[2] J. Bergstra and Y. Bengio. Random search for

hyper-parameter optimization. Journal of Machine

6

appearance_in_front

Learning Research, 13(Feb):281–305, 2012.

4

[3] J. S. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl.

2

Algorithms for hyper-parameter optimization. In

0

Advances in Neural Information Processing Systems,

0

1

2

3

4

5

6

7

8

9

10

front

pages 2546–2554, 2011.

[4] C. M. Bishop. Neural networks for pattern recognition.

Oxford university press, 1995.

Figure 3: Models ranked by Pareto dominance.

[5] T. Chen and C. Guestrin. Xgboost: A scalable tree

boosting system. arXiv preprint arXiv:1603.02754,

2016.

Use scripts for submitting the jobs and obtaining

[6] G. Hinton. A practical guide to training restricted

the logs.

boltzmann machines. Momentum, 9(1):926, 2010.

[7] F. Hutter. Automated configuration of algorithms for

5.

CONCLUSION

solving hard computational problems. PhD thesis,

The paper presents the model selection and hyper-parameter

University of British Columbia, 2009.

tuning problems. Since the problem is computationally de-

[8] D. R. Jones, M. Schonlau, and W. J. Welch. Efficient

manding JSI grid infrastructure was used in combination

global optimization of expensive black-box functions.

with a simple random search method for optimization in

Journal of Global optimization, 13(4):455–492, 1998.

complex search space. Methods were applied to the Metis

[9] D. Kuznar and M. Gams. Metis: system for early

use-case, where the goal was to predict whether a pupil will

detection and prevention of student failure. In

have a negative grade at the end of the grading period. Using

6thInternational Workshop on Combinations of

random search on the grid we have found a well performing

Intelligent Methods and Applications (CIMA 2016),

white-box model that will be applied in production.

page 39, 2016.

[10] H. Larochelle, D. Erhan, A. Courville, J. Bergstra,

The experience in working with the JSI grid infrastructure

and Y. Bengio. An empirical evaluation of deep

was also described. It was found that connectivity issues

architectures on problems with many factors of

between nodes, resource issues on the nodes themselves, and

variation. In Proceedings of the 24th international

batch job submissions prevent the user from fully exploiting

conference on Machine learning, pages 473–480. ACM,

the grid resources. However, the grid can still be exploited

2007.

by following the proposed best practices. By increasing the

[11] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel,

awareness of the grid availability among the JSI researchers

B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer,

we expect that several issues that are now present will be

R. Weiss, V. Dubourg, J. Vanderplas, A. Passos,

removed in the future. This will enable JSI researchers to

D. Cournapeau, M. Brucher, M. Perrot, and

use the grid to its full potential and give the JSI advantage

E. Duchesnay. Scikit-learn: Machine learning in

in national and international research projects.

Python. Journal of Machine Learning Research,

12:2825–2830, 2011.

All scripts and data for this paper are available at https:

//repo.ijs.si/jernejzupancic/IS2016/tree/master.

47





Sinteza slovenskega govora na mobilni platformi Android

Tomaž Šef



Institut “Jožef Stefan”



Jamova cesta 39



1000 Ljubljana

+386 1 477 34 19

tomaz.sef@ijs.si



POVZETEK

verziji prav tako podpirata »on-line« sintezo, ki zahteva

podatkovno povezavo, s to razliko, da so novo razviti glasovi, ki

V članku predstavljamo mobilno aplikacijo in e-storitev, ki

zagotavljajo občutno višjo stopnjo naravnosti rezultirajočega

omogoča sintezo slovenskega govora na mobilni platformi

sintetičnega govora, dosegljivi le preko eBralca On-Line.

Android. Podpora se vgradi v sam operacijski sistem mobilne

naprave, kar zagotavlja povezljivost s poljubnim programom

nameščenim

2. E-STORITEV

na tej napravi. Opisana rešitev je nepogrešljiva za

E-storitev omogoča vgradnjo poljubnega Microsoft SAPI

slepe in slabovidne ter osebe z motnjami branja, ker jim omogoča

rabo različnih orodij za delo z mobilnimi

kompatibilnega (slovenskega) sintetizatorja govora v operacijski

napravami, učenje in

samopomoč. Koristna

sistem mobilnih naprav. E-storitev je sestavljena iz strežnika in

je tudi za vse druge uporabnike mobilne

mobilne aplikacije.

platforme Android, saj je govor velikokrat najbolj primerna in

naravna oz. včasih celo edina varna oblika posredovanja informacij

Strežnik na eni strani omogoča povezavo s sintetizatorjem govora,

in drugih vsebin.

na drugi strani pa z mobilno napravo. Sintetizator govora je lahko

Ključne besede

naložen bodisi na istem strežniku kot sama strežniška aplikacija



bodisi se poveže z nekim oddaljenim strežnikom z nameščenim

Govorni bralnik besedil, sinteza slovenskega govora, mobilna

sintetizatorjem govora (npr. s strežnikom eGovorec). Sintetizator

platforma Android.

govora lahko razvijalci sproti posodabljajo, ne da bi uporabniki za

to sploh vedeli oz. jim za uporabo najnovejše različice ni potrebno

1. UVOD

narediti čisto nič (ni nobene potrebe po prenosu sintetizatorja

Razvili smo mobilno aplikacijo in e-storitev, ki omogoča sintezo

govora, saj se ta posodobi na samem strežniku).

slovenskega govora na mobilni platformi Android. Sistem je

Mobilna aplikacija na platformi Android omogoča vgradnjo

zasnovan v obliki strežnika v oblaku in pripadajočih aplikacij.

glasov vseh v strežnik prijavljenih sintetizatorjev govora v

Podpira industrijski standard Android Speech API, kar zagotavlja

operacijski sistem mobilne naprave. V nastavitvenem meniju

vključitev sintetizatorja govora v sam operacijski sistem in

mobilne naprave se pri govornem bralniku avtomatsko pojavijo še

posledično njegovo povezljivost s poljubnimi programi na tej

slovenski glasovi.

razširjeni mobilni platformi. Po kvaliteti nekoliko slabši sintetizator

slovenskega govora se nahaja na sami mobilni napravi (off-line

Arhitektura obeh razvitih e-storitev je podobna Tako DysLex

različica) in omogoča branje tudi ob izklopljeni ali nedosegljivi

kot eBralec KSS znata samodejno preklapljati med 3G in Wifi

podatkovni povezavi. Bolj kakovosten in naraven govor je

povezavo, pri čemer lahko rabo mobilne podatkovne povezave

dosegljiv preko e-storitve v oblaku (on-line različica), ki za svoje

(3G) tudi onemogočimo. Vgrajeni senzor prenosa podatkov na

delovanje potrebuje podatkovno povezavo (3G ali Wifi) z

mobilni napravi zagotavlja spremljanje in nadzor nad porabo

oddaljenim govornim strežnikom.

podatkovnega prenosa. Ob odsotnosti kakršnekoli podatkovne

povezave sistema avtomatsko preklopita na (po kvaliteti sicer

Opisana rešitev je nepogrešljiva za slepe in slabovidne ter osebe z

slabšo) »off-line« sintezo govora. Ko je povezava z govornim

motnjami branja, saj jim omogoča rabo različnih orodij za delo z

strežnikom ponovno mogoča, pa se sinteza govora začne zopet

mobilnimi napravami, učenje in samopomoč. Najbolj razširjena

izvajati v boljši kvaliteti.

programska oprema je že prilagojena njihovim specifičnim

potrebam. Uporabna pa je tudi za vse druge uporabnike mobilne

Arhitektura razvite e-storitve DysLex s strežniško aplikacijo v

platforme Android, saj je govor velikokrat najbolj primerna in

Arnesovem oblaku in pripadajočo mobilno aplikacijo

naravna oz. včasih celo edina varna oblika posredovanja informacij

(odjemalcem) na mobilni napravi je prikazana na sliki 1

in drugih vsebin (npr. branje elektronske pošte med vožnjo).

(http://dis.ijs.si/dyslex/). Sintetizatorji govora se nahajajo bodisi v Arnesovem oblaku (MS SAPI 5 združljivi) bodisi na samostojnem

Trenutno obstajata dve različici programa oz. e-storitve:

strežniku (e-Govorec).

 eBralec KSS (https://play.google.com/store/apps/details?id=si.

ijs.dis. ebralec) je namenjen slepim in slabovidnim, osebam z

Pri e-storitvi eBralec KSS se strežniška aplikacija in novo razviti

motnjami branja ter zaposlenim v javni upravi,

sintetizator govora eBralec nahajata izključno na strežniku

 DysLex [1] (https://play.google.com/store/apps/details?id=si.

Knjižnice slepih in slabovidnih. eBralec KSS zna dodatno

ijs. dyslex) je brezplačna storitev, ki je splošno dostopna in upravljati z vgrajeno zaščito novega sintetizatorja govora, nastavlja

vsem na razpolago.

se lahko tudi različne parametre »Besednika« in »Analizatorja«.

Androidna aplikacija eBralec KSS omogoča še vpis testnega

Obe verziji vključujeta »off-line« sintetizator slovenskega govora,

besedila, rabo odložišča ter posreduje (govorna) navodila o rabi

ki deluje tudi takrat, ko podatkovna povezava ni na razpolago. Obe

programa (slika 2).

48





DysLex

ARNESOV



strežniška

eGovorec

OBLAK

aplikacija



…

DysLex

Govorec Proteus FreeTTS

mobilna ali

FreeTTS

(združljivi z MS SAPI 5)

(strežnik

spletna aplikacija



v oblaku)





Slika 1. Arhitektura e-storitve DysLex.



E-storitev se zelo preprosto nadgrajuje in posodabljala. Rešitev

 branje izbranega besedila v poljubni aplikaciji (za

je horizontalno skalabilna, s čimer je omogočeno optimalno

lektoriranje lastnega dela dislektikov [2], branje knjig,

prilagajanje strojnih virov dejanskim potrebam oz. rasti števila

uporabnikov in vsebin; omogočena je hitra vzporedna obdelava na

izgovorjava posameznih besed, branje elektronskih

večprocesorskih računalnikih, podprta je možnost namestitve n

izročkov, dostopanje na internet),

a



grozd računalnikov (v oblaku). Sintetizatorje govora lahko

 branje zaslona mobilne naprave,

razvijalci posodabljajo sproti, ne da bi uporabniki za to sploh

 avtomatsko branje samodejnih popravkov in velikih

vedeli. Slednji zgolj opažajo, da je govor čedalje bolj razumljiv in

začetnic,

naraven ter da se v sistemu pojavljajo novi glasovi v slovenskem

 branje menijev v načinu za slepe in slabovidne oz.

jeziku.

drugače prizadete (funkcija TalkBack).

Vgradnja govornega bralnika v sam operacijski sistem mobilne

Pri uporabi sintetizatorjev govora je namreč zelo pomembno, da je

naprave (s podporo standardu Android Speech API, slika 3)

zagotavlja največjo uporabniško dostopnost in najboljšo

podprto čim več tipov elektronskega gradiva (npr. tudi pdf datoteke

in internetni viri) ter čim več že obstoječe programske opreme.

uporabniško izkušnjo. Avtomatsko začnejo delovati vse z

Slovenski glasovi sedaj delujejo v vseh aplikacijah, ki uporabljajo

operacijskim sistemom podprte funkcionalnosti govornega

govorni bralnik operacijskega sistema Android; vključno z vsemi

bralnika mobilne naprave tudi v slovenskem jeziku. Takšne

aplikacijami za dislektike ter slepe in slabovidne.

funkcionalnosti so npr.:





Slika 2. Androidna aplikacija eBralec KSS

49





Slika 3. Vgradnja eBralca v izbirnike operacijskega Sistema Android





Slika 4. Shema sistema eBralec za tvorjenje govora z uporabo PMM [4]



Strežnik e-storitve DysLex je dostopen tudi programerjem, ki

“country”:”SI”,

bi želeli podporo sinteze (slovenskega) govora vnesti neposredno v

“localVoice”:true,

svoje programe. Odjemalec, ki je bodisi spletna stran bodisi

“remoteServerName”:””,

aplikacija, pošlje želeno besedilo na strežnik HTTP. Strežnik

“remoteServerIP”:”localhost”}

pretvori besedilo v govor, ki ga v obliki zvočnega zapisa pošlje

Funkcija speak

nazaj odjemalcu v predvajanje. Strežnik se nahaja na IP naslovu

141.ablak.arnes.si in posluša na vrata 80. Splošne funkcije:

 Namenjena je pretvorbi besedila v govor.

 Parametri:

Funkcija info

f=speak – strežniku pove, da hoče pretvorbo besedila v zvok

 Namenjena je splošni poizvedbi o glasovih, ki so nameščeni na

t=”<besedilo>” – besedilo, ki bo pretvorjeno v zvok

strežniku DysLex. Odjemalec praviloma najprej kliče to

v=”<ime glasu>” – ime glasu za sintezo govora

funkcijo, da lahko (po potrebi) v svojem uporabniškem

s=”<hitrost>” – hitrost govora

vmesniku prikaže spisek vseh razpoložljivih glasov.

o= “<mpeg oz. wav>” – kodiranje

 Zahteva tipa GET (GET Request)

 Primer sinteze:

Primer: http://141.ablak.arnes.si/?f=info

http://141.ablak.arnes.si/?f=speak&t=Primer&v=Renato%20

 Odgovor tipa application/json

Govorec&s=0&o=mpeg

Pomembni parametri odgovora so: id, name, lang, country in

Funkcija heartbeat

age

 Primer odgovora v JSON formatu:

 Namenjena je za preverjanje delovanja strežnika.

{“id”:”GovRenato”,

 Odgovor s statusom 200 OK in besedilo Server is working!

“name”:”Renato Govorec”,

 Primer zahteve: http://141.ablak.arnes.si/?f=heartbeat

“lang”:”slv”,

Spletna stran projekta DysLex s podrobnejšimi informacijami se

“gender”:”Male”,

nahaja na naslovu http://dis.ijs.si/dyslex/.

“age”:”Adult”,

50

3. SINTETIZATOR GOVORA

4. ZAKLJUČEK

Umetno generirani govor mora zveneti naravno in biti prijeten za

Predstavljeni e-storitvi in mobilni aplikaciji omogočata podporo

poslušanje [3]. Pomembne so tudi nastavitve za hitrost branja in

sintezi slovenskega govora v operacijskem sistemu Android, kar

jakost zvoka ter možnost uporabe različnih glasov.

pomeni, da lahko umetno generirani govor poslušamo v vseh

eBralec (http://ebralec.si) je najnovejši sintetizator slovenskega aplikacijah (npr. PDF reader, brskalnik), ki sintezo govora

podpirajo in so naložene na

govora [4]. Prvenstveno je namenjen slepim in slabovidnim

teh napravah. Za operacijska sistema

iOS in Windows Phone takšna vgradnja žal še ni možna, ker

uporabnikom ter osebam z motnjami branja. V primerjavi s

proizvajalca operacijskega sistema te funkcionalnosti zaenkrat še

preostalimi (predhodnimi) sintetizatorji govora za slovenski jezik

omogoča občutno višjo stopnjo naravnosti sintetičnega govora.

ne omogočata.

Aplikacija eBralec je plod sodelovanja Instituta »Jožef Stefan« ter

Na mobilnih napravah z Androidom lahko dislektiki ter slepi in

podjetij Alpineon in Amebis. Vsebuje moški (eBralec Renato) in

slabovidni sedaj uporabljajo vse (učne) pripomočke, izdelane za

ženski (eBralec Maja) glas. Temelji na obsežnih govornih korpusih

globalni trg ali že standardno vgrajene v napravo.

(tabela 1) in sintezi govora s prikritimi markovovimi modeli

Naravnost in razumljivost novega sintetizatorja slovenskega

(PMM; slika 4). Na platformi Android je dosegljiv preko aplikacije

govora eBralec (http://ebralec.si/) sta primerljivi s sintetizatorji eBralec KSS v »on-line« načinu delovanja.

govora za druge jezike. Poslušanje takšnega govora ni več naporno,

Velikost

besednega 7.145.345 povedi

zato je sintetizator primeren za najširši krog uporabnikov.

korpusa

77 milijonov besed

5. ZAHVALA

Obseg

govorne

zbirke

Posebna zahvala gre Dejanu Kostadinovskemu in Janiju Bizjaku,

4.000 povedi

ki sta sprogramirala strežniško in mobilno aplikacijo ter zagotovila





46.785 besed

podporo standardu Android Speech API.

6 ur 3 min posnetkov za

Operacijo je delno sofinancirala Evropska unija iz Evropskega

ženski glas

sklada za regionalni razvoj ter Ministrstvo za izobraževanje,

znanost in šport. Operacija se je izvajala v okviru Operativnega

5 ur 33 min posnetkov za

moški

programa krepitve regionalnih razvojnih potencialov za obdobje

glas

2007-2013,

razvojne

prioritete:

Gospodarsko

razvojna

Število različnih difonov v 1.883

infrastruktura; prednostne usmeritve Informacijska družba.

zbirki

Razvoj eBralca je bil delno financiran v okviru projekta Knjižnica

Število različnih trifonov v 21.369

slepih in slabovidnih Minke Skaberne. Operacijo je delno

zbirki

financirala Evropska unija iz Evropskega socialnega sklada.

(24.702)

Operacija se je izvajala v okviru Operativnega programa razvoja

(št. kombinacij v korpusu)

človeških virov, razvojne prioritete "Enake možnosti in

spodbujanje socialne vključenosti", prednostne usmeritve "Dvig

Tabela 1. Podatki o govorni zbirki eBralca [4]

zaposlenosti ranljivih družbenih skupin na področju kulture in

podpora njihovi socialni vključenosti".

Govorec je starejši sintetizator govora, ki temelji na difonski

sintezi. Prva različica je nastala v letu 1998, kasneje pa so se

posamezni moduli nadgrajevali. Difonska sinteza govora zveni

6. LITERATURA IN VIRI

precej manj naravno od korpusne [5, 6]. Vebuje dva moška glasova:

[1] Šef, T., 2015. Univerzalni govorni e-bralnik za slovenski jezik

Renato Govorec in Matej Govorec. Na platformi Android sta oba

kot osebni učni pripomoček za ljudi z disleksijo in različnimi

glasova dosegljiva tako preko aplikacije eBralec KSS kot preko

vrstami motnje vida, Vzgoja in izobraževanje v informacijski

aplikacije DysLex v »on-line« načinu delovanja. Oba glasova sta

družbi - VIVID 2015 : zbornik referatov, Fakulteta za

priložena tudi novemu sintetizatorju govora eBralec.

organizacijske vede, str. 510-519, 2015.

E-govorec

(http://dis.ijs.si/e-govorec/)

ponudnikom

najraz-

[2] Bravo, društvo za pomoč otrokom in mladostnikom s

ličnejših e-vsebin omogoča dinamično podajanje informacij v

specifičnimi učnimi težavami. (http://www.drustvo-bravo.si).

govorni obliki ter v domačem slovenskem jeziku. Po kvaliteti oz.

[3] Taylor, P., 2009. Text-to-Speech Synthesis, Cambridge

naravnosti govora je enak Govorcu. Gre za brezplačno storitev, ki

University Press.

jo lahko uporablja kdorkoli. Do njega dostopa tudi aplikacija

DysLex.

[4] Žganec Gros, J., Vesnicer, B.,Rozman, S., Holozan, P., Šef,

T., 2016. Sintetizator govora za slovenščino eBralec, Zbornik

eBralec off-line je kompakten sintetizator slovenskega govora, ki

konference Jezikovne tehnologije in digitalna humanistika,

teče neposredno na mobilni napravi. Temelji na difonski sintezi. Pri

Filozofska fakulteta, Univerza v Ljubljani, 2016.

njegovem razvoju je bil poudarek predvsem na odzivnosti ter

majhni potrebi po računski moči in pomnilniku. Posledica je

[5] Šef, T., Gams, M., 2003. Speaker (GOVOREC): a complete

občutno nižja kvaliteta oz. naravnost govora kot pri eBralcu.

Slovenian text-to speech system . International journal of

Njegova prednost je, da za svoje delovanje na mobilnih napravah

Speech Technology, Kluwer Academic Publishers, št. 6, str.

ne potrebuje podatkovne povezave. Vsebuje moški glas Aleš. Na

277-287, 2003.

platformi Android ga v »off-line« načinu uporabljata tako

[6] Šef, T. 2001. Analiza besedila v postopku sinteze slovenskega

aplikacija eBralec KSS kot DysLex.

govora. Doktorska disertacija, Fakulteta za računalništvo in

informatiko, Univerza v Ljubljani, 2001.



51





Showing the Knee of a 4-D Pareto Front Approximation via

Different Visualization Methods

Tea Tušar

Bogdan Filipič

Department of Intelligent Systems

Department of Intelligent Systems

Jožef Stefan Institute

Jožef Stefan Institute

Ljubljana, Slovenia

Ljubljana, Slovenia

tea.tusar@ijs.si

bogdan.filipic@ijs.si

ABSTRACT

50

When decision makers select one or more trade-off solu-

tions to a multiobjective optimization problem, they are

40

mostly interested in solutions residing at knees—regions of

the Pareto front where a small improvement in one objective

leads to a large deterioration in at least one other objective.

30

It is therefore important to be able to detect such regions,

f 2

preferably through visualization. This paper presents visu-

alizations of Pareto front approximations of a multiobjective

20

problem with knees. More specifically, we show how a sam-

pled Pareto front of a four-objective problem with a single

10

knee looks like when visualized using seven different meth-

ods (scatter plot matrix, bubble chart, parallel coordinates,

radial coordinate visualization, level diagrams, hyper-radial

0

0

10

20

30

40

50

visualization and prosections). We can observe that while

f1

the first four methods cannot visualize the knee, the remain-

ing three are able to do so and are therefore more suitable

for use in visualization of Pareto front approximations.

Figure 1: The DEB2DK problem with a single knee

(solutions near the knee are shown in red).

1.

INTRODUCTION

When inspecting a set of solutions to a multiobjective opti-

mization problem, decision makers usually prefer solutions

50

lying at knees—regions of the Pareto front where a small

improvement in one objective leads to a large deterioration

40

in at least one other objective [2]. Visualizing knee regions

in a clear and concise way is therefore a crucial requirement

30

f

for supporting the decision making process. Finding a good

3

20

representation of the Pareto front and its knees is rather

straightforward in the case of two or three objectives (see

10

Figures 1 and 2 for two examples)1, but very challenging

when the number of objectives is four or more [8].

0

10

20

30

40

This paper presents visualizations of the single-knee instance

50 0

10

20

30

40

50

f1

of the DEB4DK problem [2, 7] using seven different meth-

f2

ods, namely the scatter plot matrix, bubble chart, parallel

coordinates [5], radial coordinate visualization [4], level dia-

Figure 2: The DEB3DK problem with a single knee

grams [1], hyper-radial visualization [3] and prosections [8].

(solutions near the knee are shown in red).

All these methods have been previously used to visualize

Pareto front approximations and have been analyzed with

respect to some desired properties for visualization meth-

ods [8].

2.

THE DEB4DK PROBLEM

Branke et al. introduced a family of optimization problems

Section 2 provides the formal definition of the DEB4DK

with knees that are based on the DTLZ benchmark problems

problem, while the visualizations with different methods are

and scalable to any number of objectives [2]. While only

shown in Section 3. Some concluding remarks are presented

the instances with two and three objectives were presented

in Section 4.

originally, they were scaled to four and five objectives in [7].

1Minimization in all objectives is assumed throughout this

paper.

Let us recall the formal definition of the DEB4DK problem

52

with four objectives:

60

60

60

π



π



π



f1(x) = g(x)r(x) sin

x1 sin

x2 sin

x3

2

2

2

40

40

40

π



π



π



f 2

f 3

f 4

f2(x) = g(x)r(x) sin

x1 sin

x2 cos

x3

20

20

20

2

2

2

π



π



f3(x) = g(x)r(x) sin

x1 cos

x2

0

0

0

2

2

0

20

40

60

0

20

40

60

0

20

40

60

π



f

f

f

f

1

1

1

4(x) = g(x)r(x) cos

x1

60

60

2

n

9

X

40

40

g(x) = 1 +

xi

n − 1

f 3

f 4

i=2

20

20

r1(x1) + r2(x2) + r3(x3)

r(x) =

3

0

0

0

20

40

60

0

20

40

60

3 cos(2Kπxi)

r

f2

f2

i(xi) = 5 + 10(xi − 0.5)2 +

K

60

x ∈ [0, 1]n.

40

f 4

20

Here, n denotes the dimensionality of the decision space,

while parameter K is used to control the number of knees

0

on the Pareto front.

A DEBmDK optimization problem

0

20

40

60

with m objectives has Km−1 knees. This means that the

f3

four-objective DEB4DK problem can be instantiated to have

1, 8, 27, . . . knees (for K = 1, 2, 3, . . . respectively). In the

Figure 3: Scatter plot matrix of 300 Pareto-optimal

rest of the paper, the value K = 1 yielding the Pareto front

solutions of the DEB4DK problem with a single knee

with a single knee will be used.

(solutions near the knee are shown in red).

3.

RESULTS OF DIFFERENT VISUALIZA-

60

TION METHODS

50

This section shows how different methods visualize the single

40

knee of the DEB4DK problem with K = 1.

f

30

3

20

The set of solutions approximating the Pareto front (also

called approximation set) was obtained by sampling the front

10

of the DEB4DK problem with 3000 points. Among them,

0

eight solutions closest to the knee were chosen as the so-

10

lutions “at the knee” to be emphasized in the visualizations

20

(analogous to the red knee solutions emphasized in Figures 1

30

f

and 2). The underlying idea is that a visualization method

1

40

should be capable of showing the knee solutions in a way

60

50

50

40

that makes the knee recognizable to a decision maker.

30

60

20

10

0

f2

Since some methods have difficulties when visualizing a large

number of solutions, only the first 300 solutions were used

Figure 4: Bubble chart showing 300 Pareto-optimal

in such cases. Out of the eight knee solutions contained in

solutions of the DEB4DK problem with a single knee

the original sample, five were retained in this smaller set.

(solutions near the knee are shown in red).

3.1

Scatter Plot Matrix

Let us first view the results of the scatter plot matrix—a

approximation set with 300 solutions. Similarly as before,

matrix of all possible 2-D projections of a multidimensional

this plot shows that the red knee solutions have low values

set of solutions. Figure 3 presents the scatter plot matrix

in all four objectives, but we cannot perceive the presence

for the smaller approximation set containing 300 solutions.

of a knee.

Because of the projection to a 2-D space, most of the in-

formation is lost and the solutions at the knee cannot be

3.3

Parallel Coordinates

differentiated from the rest of the solutions. Even an inter-

Parallel coordinates [5] is a very popular method for visu-

active exploration cannot be of much help in such a case.

alizing multidimensional data that represents objectives as

vertical parallel lines. Each solution is shown as a poly-line

3.2

Bubble chart

intersecting each objective (i.e. coordinate) at its value. In

A bubble chart is a scatter plot in which three objectives

the case of our smaller approximation set (see Figure 5), the

are shown on axes while the fourth objective is represented

knee solutions do not stand out from the rest and therefore

with point size. Figure 4 contains the bubble chart for the

do not give a good idea regarding the shape of the approx-

53

60

f1

50

40

30

20

f

f

4

2

10

0 f1

f2

f3

f4

Figure 5: Parallel coordinates of 300 Pareto-optimal

solutions of the DEB4DK problem with a single knee

(solutions near the knee are shown in red).

imation set. Moreover, the method is very sensitive to the

f3

number of displayed solutions. When this number is small,

it can be very informative and is able to show dominance

Figure 6:

Radial coordinate visualization of 3000

relations between solutions. When, on the other hand, hun-

Pareto-optimal solutions of the DEB4DK problem

dreds or thousands of solutions need to be visualized, the

with a single knee (solutions near the knee are shown

cluttered result causes loss of most of the information.

in red).

3.4

Radial Coordinate Visualization

The radial coordinate visualization (also called RadViz [4])

60

60

places objectives on the circumference of a unit circle and

the solutions inside the circle so that the distance of a solu-

40

40

tion to each of the objectives is proportional to its value in

that objective. For example, solutions with greater values in

20

20

Distance to

Distance to

the ideal point

the ideal point

f1 than in any other objective are placed closer to f1 than

0

0

the other objectives. Figure 6 shows the radial coordinate

0

20

40

60

0

20

40

60

visualization of the entire approximation set with 3000 so-

f1

f2

lutions. Again, the knee solutions cannot be differentiated

60

60

from the rest in this plot.

40

40

3.5

Level Diagrams

20

20

Distance to

Distance to

Level diagrams [1] plot each solution against one of its ob-

the ideal point

the ideal point

0

0

jectives and the (Euclidean) distance to the ideal point. In

0

20

40

60

0

20

40

60

case of four objectives, four such diagrams are produced.

f3

f4

Figure 7 presents the level diagrams for the smaller approx-

imation set and we can clearly see the knee in all of them.

Figure 7: Level diagrams for 300 Pareto-optimal so-

lutions of the DEB4DK problem with a single knee

3.6

Hyper-Radial Visualization

(solutions near the knee are shown in red).

In hyper-radial visualization [3] the solutions are plotted

against their distance to the ideal point (their hyper-radius)

separately for two subsets of objectives. In our case, the

the ideal point would probably be undetectable with these

x axis represents the hyper-radius for objectives f1 and f2,

visualization methods. Showing this is left for future work.

while the y axis represents the hyper-radius for objectives f3

and f4. As shown in Figure 8, the eight knee solutions from

the larger approximation set are clearly recognizable in this

3.7

Prosections

visualization.

Prosections [6, 8] are projections of only a section of the

objective space at a time. For example, a section of width

A note on the level diagrams and the hyper-radial visual-

d is selected at angle ϕ on the plane f1f2. All solutions

ization: while both methods are able to show the knee in

that fall in this section are projected onto the line going

this case, this is mostly due to the fact that the knee has

through the ideal point at angle ϕ while all other solutions

the shortest distance to the ideal point. In case of multi-

are discarded. By showing a 3-D scatter plot with the x

ple knees, for example, the knees that are not very close to

axis representing the projected line and the other two axes

54

solutions is possible, it also requires to plot multiple visual-

40

izations (using various planes and angles) to get a complete

view of an approximation set. Another advantage of this

35

method is that it can easily handle large sets of solutions

because only a part of the set is visualized in a single plot.

30

4.

CONCLUSIONS

This paper presented how seven visualization methods pre-

25

viously used to visualize Pareto front approximations cope

with visualization of knees. Only three methods (level dia-

f 4f

20

f 3f

grams, hyper-radial visualization and prosections) were able

to show the single knee of the four-objective DEB4DK prob-

15

lem. Additional work is needed to see how these visualiza-

tion methods perform on a problem with more knees.

10

5.

ACKNOWLEDGMENTS

This work is part of a project that has received funding from

5

the European Union’s Horizon 2020 research and innovation

program under grant agreement No. 692286. This work was

0

partially funded by the Slovenian Research Agency under

0

5

10

15

20

25

30

35

40

research program P2-0209.

f1

f f

1 2

f

6.

REFERENCES

Figure 8: Hyper-radial visualization of 3000 Pareto-

[1] X. Blasco, J. M. Herrero, J. Sanchis, and M. Mart´ınez.

optimal solutions of the DEB4DK problem with a

A new graphical visualization of n-dimensional Pareto

single knee (solutions near the knee are shown in

front for decision-making in multiobjective

red).

optimization. Information Sciences, 178(20):3908–3924,

2008.

[2] J. Branke, K. Deb, H. Dierolf, and M. Oswald. Finding

60

knees in multi-objective optimization. In Proceedings of

the 8th International Conference on Parallel Problem

50

Solving from Nature, PPSN VIII, Birmingham, UK,

40

volume 3242 of Lecture Notes in Computer Science,

f

30

pages 722–731. Springer, 2004.

4

[3] P.-W. Chiu and C. Bloebaum. Hyper-radial

20

visualization (HRV) method with range-based

10

preferences for multi-objective decision making.

Structural and Multidisciplinary Optimization,

0

5 10 15 20 25 30

40(1–6):97–115, 2010.

35

f

40 0

10 20 30 40 50 60

f

1f2

3

[4] P. E. Hoffman, G. G. Grinstein, K. Marx, I. Grosse, and

E. Stanley. DNA visual and analytic data mining. In

Proceedings of the IEEE Conference on Visualization,

Figure 9: Prosection 4D(0, f1f2, 45◦, 2) showing a part

pages 437–441, Los Alamitos, CA, USA, 1997. IEEE.

of 3000 Pareto-optimal solutions of the DEB4DK

[5] A. Inselberg. Parallel Coordinates: Visual

problem with a single knee (solutions near the knee

Multidimensional Geometry and its Applications.

are shown in red).

Springer, New York, NY, USA, 2009.

[6] T. Tušar and B. Filipič. Visualizing 4D approximation

sets of multiobjective optimizers with prosections. In

representing objectives f3 and f4, we get a 3-D representa-

Proceedings of the Genetic and Evolutionary

tion of this section of the 4-D objective space that can be

Computation Conference, GECCO ’11, Dublin, Ireland,

denoted as 4D(0, f1f2, ϕ, d). Figure 9 shows the prosection

pages 737–744. ACM, 2011.

4D(0, f1f2, 45◦, 2), that is, a prosection on the plane f1f2

using the angle ϕ = 45◦ and section width d = 2. The knee

[7] T. Tušar and B. Filipič. Scaling and visualizing

points are clearly visible in this plot. In theory, prosections

multiobjective optimization test problems with knees.

should be able to visualize any number of knees regardless of

In Proceedings of the 15th International

their position (using multiple plots—each visualizing a dif-

Multiconference Information Society, IS 2012,

ferent section of the space), but experiments to demonstrate

Ljubljana, Slovenia, volume A, pages 155–158,

this are still needed.

Ljubljana, Slovenia, 2012. Jožef Stefan Institute.

[8] T. Tušar and B. Filipič. Visualization of Pareto front

The advantages and disadvantages of prosections originate

approximations in evolutionary multiobjective

from the fact that only a section of the space is visualized at

optimization: A critical review and the prosection

a time. While this means that a visualization of high accu-

method. IEEE Transactions on Evolutionary

racy that mostly preserves the dominance relations among

Computation, 19(2):225–245, 2015.

55





Machine learning method for stress detection with an EEG

device

Martin Gjoreski, Mitja Luštrek, Matjaž Gams

Department of Intel igent Systems, Jožef Stefan Institute

Jožef Stefan International Postgraduate School

Jamova cesta 39, Ljubljana, Slovenia

{martin.gjoreski, mitja.lustrek, matjaz.gams}@ijs.si





ABSTRACT

Electroencephalography (EEG) is a method for monitoring the

We present a study for stress detection using a one-channel

electrical activity of the brain. It reflects the upper cognitive

commercial EEG device. For this purpose, a stress-inducing

functions and mental or psychological states. In general, increased

experiments were performed. 21 subjects were monitored while

activity in the higher frequency bands is correlated with arousal,

performing mental-arithmetic tasks under time and evaluation

thinking or emotional activity. For example, the midrange beta

pressure. The data was used to develop a machine learning

frequency band (16–20 Hz) is considered to be correlated with

method for stress detection which consists of: a segmentation

emotional “intensity” that may come from anxiety [20]. In

phase, a filtering phase, a feature extraction phase and a model-

contrast, a lower spectral activity, e.g., activity in the delta

learning phase. The initial evaluation results are promising and

frequency band (0.1Hz to 3Hz) is correlated with deep sleep.

showcase that person-specific models can perform for a 19

The recent technological advancements have made cheap

percentage points better than the majority class.

commercial EEG devices generally available. Depending on the



quality of the device, they may capture the contrasting sleep-

thinking working modes of the brain or possibly even stress/no-

Categories and Subject Descriptors

stress situations. In this study, we explore the use of a single

J.3 [Computer Applications]: Health

channel EEG device for stress detection in office-like conditions.

Keywords

2. RELATED WORK

Stress detection; EEG device; machine learning; health.

Even though there are numerous studies on stress using

psychology instruments or hormones analysis, the number of

1. INTRODUCTION

stress studies using electroencephalogram (EEG) for stress

Stress is a process triggered by a demanding physical and/or

detection is quite low. This indicates the complexity of the

psychological event [6]. It is not necessarily a negative process,

problem [19]. Furthermore, the use of a commercial one-channel

but when present continuously it can result in chronical stress,

device makes the problem of stress detection even more

which has negative health consequences such as raised blood

challenging.

pressure, bad sleep, increased vulnerability to infections, slower

Peng et al. analyzed methods for identifying chronic stress by a

body recovery and decreased mental performance [7]. Regarding

medical 3-chnnel EEG device. They used statistical analysis to

the economic costs of stress, the European Commission projected

distinguish between relaxed state and chronical stress. They have

the costs of work-related stress to €25 billion a year. This is

concluded that there are different hemispheric activities between

because work-related stress leads to increased number of injuries,

left and right hemispheres for the no-stress and stress subjects.

increased absenteeism and decreased productivity [3]. Therefore,

Similar findings were presented by Giannakakis et al. [19] who

a stress-detection system would be useful for self-management of

were detecting stress in subjects while watching videos using a 32

mental (and consequently physical) health of workers [5], students

channel EEG device. They have asserted that EEG response to

and others in the stressful environment of today’s world.

positive valence stimuli is left frontal activity, while negative

To develop a stress-detection method, we must better understand

valence stimuli cause an increase to right activity. However, these

the stress process. When humans undergo a demanding physical

changes cannot be captured with a single channel EEG device.

and/or psychological event (e.g., meeting, exam, etc.), the body is

Recently, Saeed et al. [21] presented a study for stress detection

faced with a large physiological stressor which invokes a response

using a one channel EEG device. They tried to distinguish

of the sympathetic nervous system to meet the increased metabolic

between high and low stress based on a 3 minute data labelled

demands [1]. The sympathetic nervous system speeds up certain

using a Perceived Stress Scale questionnaire. This questionnaire

processes within the body (“fight-or-flight” response) [2]: it raises

gives information about stress levels accumulated over longer

the heart rate, raises sweating rate, increases blood pressure, etc.

period of time since it contains questions such as: “In the last

All of these changes are controlled by the brain, thus it is expected

month how many times …”.

for the brain’s activity to be changed in a stressful situation. In

addition, it has been reported that there are neuro-chemical

In our study, stress is induced using a standardized method, and it

changes due to stress response [10].

is assessed using State-Trait Anxiety Inventory (STAI)

questionnaire, which provides momentary information. Thus, we

56





are trying to detect different stress levels over a short period of

Table 1. Mean STAI score for the experimental data.

time (e.g., 25 minutes).

Type

Before

Easy

Medium

Hard



In the literature there are other approaches for stress detection

using wearable devices which capture physiological signals such

STAI score

10.95

13.33

14.05

13.81



as heart rate, sweating rate, R-R intervals, skin temperature and



similar [22][23][24]. However, these approaches are out of the

For each subject, the period before answering the STAI

scope of this study.

questionnaire in which they achieved the lowest score is labelled

as low stress, and for each +3 STAI points (the statistical tests

3. DATA

showed that difference of 2.38 is enough), the stress label is

A standardized stress-inducing method was used for collecting the

increased by one, thus we get low stress (lowest STAI score),

experimental data. A web application was developed in

medium stress (lowest STAI score +3) and high stress (lowest

collaboration with psychologists, which implements a variation of

STAI score +6). In the final experiments the medium and high

the stress-inducing method presented by Dedovic et al. [4]. The

stress were merged because only two subjects achieved a high

main stressor is solving a mental arithmetic task under time and

level of stress, so we had two degrees of stress low and high. The

evaluation pressure. In short, a series of randomly generated

overall duration of the low stress data was 840 minutes and the

equations were presented to subjects, who provide answers

duration of the high stress events was 724 minutes.

verbally. The time given per equation was dynamically changing.

For each two consecutive correct answers the time was shortened

4. METHOD

by 10%, and for each two consecutive wrong answers the time

The proposed method is a machine-learning pipeline presented in

was increased by 10%. Each session consisted of three series of

Figure 2. During the experiments, the data was streamed to a

equations with increasing difficulty: easy, medium and hard. Each

smartphone via Bluetooth, then transferred to a computer where

series of equations lasted for five minutes. For motivation, a

the rest of the processing was performed. The method contains

reward was promised to the top three participants. After each

several phases: filtering, segmentation, feature extraction and

stage, the participant was shown false ranking score, positioning

model learning. In the next subsections, each stage is described in

him/her in the top five, and this way motivating him/her to try

details.

harder in the next stage and try to win the award. The application

is available on-line: http://dis.ijs.si/thestest/.

During the experiments, the subjects were monitored with a

NeuroSky's MindSet as presented in Figure 1. It is a wearable

EEG that provides raw EEG data with a 512Hz sampling

frequency by an electrode placed on left side of the forehead (at

the FP1 location). In addition, the device provides computed 1Hz

data for 7 frequency sub-bands, information for meditation

intensity, information for attention intensity and information for

detected blinks.



Figure 2. The proposed method for stress detection from EEG.

4.1 Filtering and Segmentation

After a visual inspection of the data, it was obvious that the data

contains noisy segments which needed filtering. For detecting

outliers (noise) in the data we used Tukey’s method of leveraging

the interquartile range [11]. This method is independent of

distributional assumptions and it ignores the mean and standard

deviation, thus it is resistant to extreme values in the range. After

the detection, each outlier is replaced with the average value of

the two neighboring samples. Figure 3 presents an example data



of one subject. On the x-axis is the time in seconds and on the y-

Figure 1. Experimental setup. Subject 1 performing the stress

axis is the amplitude of the high beta band (20-30 Hz). The

inducing experiment.

bottom graph is the raw data as provided by the device and the top

graph is the same data after filtering.

Four Short STAI-Y anxiety questionnaires [8] were filled by each

participant: before the experiment (1), and after the easy (2),

medium (3) and hard session (4). The STAI questionnaire reflects

the degree to which the subjects have a feeling of unease, worry,

tension, and stress. Its value range is from 0 to 24, with 24 being

the highest level. The mean STAI score is presented in Table 1.

We performed statistical analysis using repeated measures

ANOVA [9]. The resulting p-value was 0.0014, confirming that

there is a statistically significant difference in the answers for the



intensity of stress. The answers of the STAI questionnaire were

Figure 3. Data of one experimental session for Subject 3. A

used for subject-specific labelling of the data.

raw signal (bottom) and the same signal after filtering (top).

57

4.2 Feature extraction

extracted come from one source, the raw EEG data. Naïve Byes is

In the feature extraction phase, a sliding-window technique was

well known for performing poorly on correlated datasets due to its

used. From each window, numerical features were calculated

“naïve” assumption that features are not correlated. On the other

using statistical functions (mean, variability, quartiles, quartile

hand, the best performing classifier is Random Forest. This is due

deviation and power) and regression analysis (intercept and slope

to its noise robustness. The highest achieved accuracy is 71%

of a fitted regression line). The features were computed from each

which is for 19 percentage points better the majority classifier.

data type provided by the device (raw EEG data downsampled to

Table 2. Accuracy (%) for each of the ML methods for a

4Hz because the high frequency does not influence significantly

varying size of the feature extraction window.

the computed features, frequency bands’ data, meditation data,

Feature-extraction

attention data and blinks data). In addition, blinks per second,

significant blinks per seconds and power of significant blinks was

ML

window (minutes)

computed from the blinks data. A blink was considered significant

Classifier

0.5

1

2

4

if its value was over some threshold (80), which was chosen

Majority

52

52

52

52

experimentally.

Naïve Bayes

55

53

53

54

4.3 Model learning

Boosting

57

57

60

55

For the model learning we used the machine learning algorithms

J48

60

63

57

56

as implemented in the WEKA machine learning toolkit [25]. We

KNN (3)

62

61

64

66

experimented with variety of ML algorithms:

SMO

64

64

64

63

•

Majority classifier – always predicts the majority label.

Bagging

66

65

66

63

This is a baseline model.

Random Forest

70

71

71

66

•

J48 – algorithms for building a decision tree (DT) [12].

6. CONCLUSION AND DISCUSSION

•

Naïve Bayes classifier – algorithm for building simple

We presented a study for stress detection with a one-channel

probabilistic classifiers based on Bayes' theorem with strong

commercial EEG device. For this purpose, a stress-inducing

independence assumptions between the features [13].

experiments were performed allowing us to gather EEG data for

21-subjects. This data can be utilized for studying the brain

•

KNN – simple algorithm which provides a prediction based

activity under stress, induced using time and evaluation pressure.

on k-nearest instances (from the training set) to the test

instance 0.

In addition, we presented a machine learning method for stress

detection including segmentation technique, filtering technique

•

SVM – algorithm for building a classifier where the

and a feature extraction technique. The initial evaluation results

classification function is a hyperplane in the feature space

are promising, but still there is a room for improvements.

[15].

An inter-study comparison with the related work is not directly

applicable mainly due to the use of different devices, different

•

Bagging – ensemble algorithm which learns base models on

subsets of train data with the purpose of reducing variance

datasets, different number of subjects etc. As an example, Vanitha

and avoiding overfitting [16]. The algorithm for learning

et al. [26] presented a system for real time stress detection based

base models was J48.

on 14-channel EEG device. The accuracy of the proposed

machine learning models varied form 70-90%. This clearly

•

Boosting – ensemble algorithm which learns models on

suggests that better results can be achieved with EEG devices that

subsets of the train data and “boosts” the weights of

provide data from more channels.

misrecognized instances allowing for the models in the

The evaluation technique used in the experiments is a 10-fold

ensemble to focus on the misclassified instances [17]. The

cross-validation which is person-depended evaluation approach.

algorithm for learning base models was J48.

We also experimented with a person-independed evaluation

•

Random Forest – ensemble method which learns base

technique (leave-one-subject-out cross-validation) and the

decision trees by subsetting the feature set [18].

performance of the models (in terms of accuracy) was near the

majority classifier. There may be several reasons for this:

5. EXPERIMENTAL RESULTS

•

Subject specificity. The EEG data may be subject-specific

We performed experiments with a varying size of the sliding

because each person has a specific reaction to such stressor.

window used in the feature extraction phase (size of the feature-

The specific reaction may not allow to build general ML

extraction window). We experimented with a window size of 0.5,

model trained on one set of subjects and tested on another

1, 2 and 4 minutes. For evaluation we used 10-fold cross-

set of subjects.

validation technique. The results are presented in Table 2. Each

•

Noise. The commercial EEG device may not be powerful

row represents the accuracy achieved by the specific classifier and

enough to capture some of the subtleties which would allow

each column represent the size of the feature-extraction window.

building general ML model. Each recorded session may be

The results show that the feature-extraction window size does not

different due to persistent noise in the data.

influence significantly the performance of any of the classifiers.

The worst performing classifier is Naïve Bayes. This is due to the

•

Advanced features. The extracted features in this study are

correlation of the features in the dataset. A high correlation is

based on statistical and regression analysis. In future, more

expected since all of the signals from which the features are

advanced features can be extracted to improve the models’

performance and generalization.

58

7. REFERENCES





[1] A. Angeli, M.minetto, A. Dovio, P. Paccoti, “The

[14] D. Aha, D. Kibler, “Instance-based learning algorithms,”

overtraining syndrome in athletes: A stress-related disorder”,

Machine Learning Vol. 6, pp. 37–66, 1991.

Journal of Endocrinological Investigation, 2004, pp 603-612.

[15] N. Cristianini, J. Shawe-Taylor, “An Introduction to Support

[2] W. B. Cannon. The Wisdom of the Body. 1932.

Vector Machines and Other Kernel-Based Learning

[3] Calculating the cost of work-related stress and psychosocial

Methods,” Cambridge University Press: Cambridge, 2000.

risks European Risk Observatory Literature Review [Online].

[16] L. Breiman, “Technical Report No. 421,” 1994.

Available: https://osha.europa.eu/en/tools-and

[17] K. Michael, “Thoughts on Hypothesis Boosting,”

publications/publications/literature_reviews/calculating-the-

Unpublished manuscript, December 1988.

cost-of-work-related-stress-and-psychosocial-risks,

[Accessed 15.10.2014].

[18] H. Tin Kam, “Random Decision Forests,” International

Conference on Document Analysis and Recognition,

[4] K. Dedovic, R. Renwick, N. K. Mahani, and V. Engert, “The

Montreal, QC, Canada, pp. 278–282, 1995.

Montreal Imaging Stress Task : using functional imaging to

investigate the ...,” vol. 30, no. 5, pp. 319–325, 2005.

[19] G. Giannakakis, D. Grigoriadis, M. Tsiknakis, “Detection of

stress/anxiety state from EEG features during video

[5] Fit4Work Project, http://www.fit4work-aal.eu/

watching,” IEEE Eng Med Biol Soc. pp. 6034-7, 2015.

[6] H. G. Ice, G. D. James. Measuring Stress in Humans: A

[20] P.M. Lehrer, R.L. Woolfolk, W.E. Sime. “Principles and

practical Guide for the field,” Cambridge university press.

practice of stress management,” Guilford Press, 2007.

2007.

[21] S. Saeed, S. M. Anwar, M. Majid, A. Bhatti, “Psychological

[7] S.C. Segerstrom, G.E. Miller. Psychological stress and the

Stress Measurement Using Low Cost Single Channel EEG

human immune system: a meta-analytic study of 30 years of

Headset,” International Symposium on Signal Processing and

inquiry. Psychological Bulletin 130, 2004, 601

Information Technology, 2015.

[8] C. D. Spielberger. State-Trait Anxiety Inventory Anxiety.

[22] M. Gjoreski. “Continuous Stress Monitoring Using a Wrist

vol. 19. p. 2009, 1987.

Device and a Smartphone,” Master thesis, Jozef Stefan

[9] T. Eftimov, P. Korošec, B. Koroušic Seljak. Disadvantages

International Postgraduate School, Slovenia, 2016.

of Statistical Comparison Of Stochastic Optimization

[23] J. A. Healey, R. W. Picard, “Detecting Stress During Real-

Algorithms. Proceedings of the Bioinspired Optimizaiton

World Driving Tasks Using Physiological Sensors,” IEEE

Methods and their Applications, BIOMA 2016.

Transactions on Intelligent Transportation Systems, 2005.

[10] McEwen, Bruce S., and John H. Morrison. "The brain on

[24] K. Hovsepian, M. Absi, T. Kamarck, and M. Nakajima,

stress: vulnerability and plasticity of the prefrontal cortex

“cStress : Towards a Gold Standard for Continuous Stress

over the life course." Neuron 79.1 (2013): 16-29.

Assessment in the Mobile Environment,” ACM Conf. on

[11] J.W. Tukey. “Exploratory data analysis.” Addison-Wesely,

Ubiquitous Computing, 2015.

1977.

[25] M. Hall, et al. “The WEKA Data Mining Software: An

[12] J.R. Quinlan, “Improved use of continuous attributes in

Update,” SIGKDD Explorations, Vol 11, Issue 1, 2009.

c4.5,” J. Artif. Intell. Res., Vol. 4, pp. 77–90, 1996.

[26] V. Vanitha, P. Krishnan, “Real time stress detection system

[13] R. Stuart, N. Peter, “Artificial Intelligence: A Modern

based on EEG signals.” Biomedical Research, pp. 271-275,

Approach,” 2nd Ed., Prentice Hall: New York, USA, 2003.

2016.





59





DESIGN OF MOBILE SYSTEMS THAT ENABLE AN ACTIVE

AND MEANINGFUL LIFE FOR DEMENTIA PATIENTS



David A. Fabjan

Ambient Intelligence Research Group

Department of Intelligent Systems

Jožef Stefan Institute

Jamova cesta 39, 1000 Ljubljana, Slovenia

Tel: +386 1 477 3865; fax: +386 1 4251038

E-mail: david.fabjan@ijs.si



ABSTRACT



Cognitive impairment may be determined by multiple

The paper presents an outlook on two expert mobile health

neuropathological processes including neurodegenerative

systems, proposed for the benefits of prolonged and

diseases, by far the most common being Alzheimer disease,

meaningful living for people in the early stages of

and vascular disease. In fact, most frequently, cognitive

Dementia.

impairment is the result of more than one disease process



(degenerative and vascular) [1]. Cognitive impairment

The research group has recently proposed such prototype

represents a progressive process that gets worse over time

systems to be built and hopefully used commercially. Both

and has no foreseen completely effective cure.

systems help people with mild cognitive impairments



serving as memory aid in one case, and as a cognitive

The main goal for people affected by Dementia is the ability

training tool in the second case, which are both presented in

to maintain an acceptable level of autonomy that would

this exploratory paper.

allow them to live a quality life at home, despite the



limitations imposed by the condition. Similarly, family

The main goal of proposed prototypes is to enable and

caregivers and care professionals of nursing homes and day

maintain an acceptable level of autonomy that would allow

care facilities are in need of technological tools to help the

affected people to live an active life, despite the limitations

people with Dementia develop a coping strategy concerning

imposed by the condition.

their illness.





Keywords

The quality of life is associated with health, social

Mobile systems, Dementia, cognitive impairments, ambient

relationships and ability of using resources such as shops,

sensors, context aware systems, prediction, wearables, care

transportation and housing [3]. Those people suffering from

giving.

dementia score poorly on all these criteria; they are



frequently ill, have difficulties in maintaining social

1 INTRODUCTION

relationships, and are unable to utilise the resources at their



disposal. To improve the quality of life, people with

Dementia is one of the most frequent neurological disorders

dementia need to maintain most of their personal

affecting the elderly population. It is estimated that in 2015,

independence, engage in social interaction, and must keep

there were 46.8 million people living with dementia

themselves occupied [4].

worldwide, with one new case every 3 seconds [1]. A



disproportionally large number of them – 10.5 million –

These factors are interrelated and there is no consensus

live in Europe, since it has much older population on

about their relative importance, although some literature

average than other continents.

suggests that independence in the activities of daily living is



the most important one [5]. To address these important

This spectrum of diseases with common name Dementia is

factors, the research group in collaboration with the

characterized by a decline in one or more of the seven

international team of medical and social experts on

cognitive domains (learning, memory, language, executive

Dementia and specialists in “serious games” and human-

function, attention, and perceptual-motor) from the previous

computer interaction, proposed two intelligent mobile

(normal, functioning) level of functionality. Such decline is

systems.

severe enough to interfere with normal daily activities and



leads to personal dependence on external help even when

2 SMART COACH

performing basic tasks such as hygiene and daily diet.





First system is re-hashing an old idea of using short

reminder notes (post-it, stickers), strategically placed to



60



remind us of important or frequent tasks. Generally, external

learning, sensor fusion and heuristics to infer information

memory aids are the most frequent strategy for dealing with

such as the user’s location (in the kitchen, on the sofa) and

the memory loss caused by Dementia [6]. Reminder notes

activity (cooking, resting). Most methods at this stage will

are usually placed in various locations around the home –

run on the sensing devices (computer vision and sound

those for meal preparation in the kitchen, for personal

analysis on the tablet) to reduce the communication

hygiene in the bathroom, etc.

overhead.





However, the reminder notes should also depend on what

The second stage will create a global awareness of the user’s

the person is currently doing: while putting clothes in the

situation involving all the sensing devices. Machine learning

washing machine - on how to operate it, on how to use

and sensor fusion are used to recognize only parts of the

shampoo; when going to the kitchen in the morning - about

user’s situation. Next step will involve decision-support

making coffee, about drinking water, about taking medicine,

methods such as proprietary hierarchical qualitative decision

etc. If one wanted to have a paper note for every occasion,

models and expert defined rules to completely determine the

though, they would become too numerous to be of use and it

user’s situation and the most probable intent. Algorithms for

would become very confusing for the individual, particularly

these methods will run remotely in the cloud.

when trying to follow certain timeline of tasks.





The third stage will use symbolic reasoning based on an

The system intends to develop intelligent reminder notes

ontology and rules to select the most appropriate reminders

that are context-sensitive. To identify the activity of the user,

and instructions based on the user’s situation evaluated and

we have to use various types of ambient sensors to detect

inferred in the second stage. Methods in this stage will be

what the person is doing. In addition, we will be able to

also implemented in the cloud, returning the output to the

utilize a pre-programmed schedule of the person’s normal

display devices.

and procedural daily activities (eating, personal hygiene,



exercises) as well as learn individual’s personal habits.



This combined can give us a very good idea about what

exactly the person is doing and what the person wants or

needs to accomplish at a given moment. This way we will be

able to display just the right reminder with the correct steps

to follow at the appropriate time.



2.1 TECHNOLOGY AND METHODOLOGY



The farfetched technology for the system would be smart

glasses or even smart contact lenses that would sense and



analyse the user’s environment, interpret context and intent,



and provide the appropriate reminders or procedural

Figure 1. The Architecture of the smart coach system

instructions directly in front of the eyes [7]. From proven



technology, available in the market, the following options

2.2 EXPECTED IMPACT

were selected:





Each person with dementia living at home is cared for by

Display: Tablets mounted on strategic locations in the user’s

two or three carers on average. With the growing number of

home and smart glasses, almost indistinguishable from

dementia cases, this “dependency ratio” is difficult to

regular glasses;

sustain, resulting in a high demand for care solutions, and



representing a growing burden in terms of economic, social

Sensing: Camera and microphone in the tablet, we use a

and human costs.

camera and perform vision and sound analysis, intended to



detect the presence of the user, basic activities such as

The main impact of the expert mobile system is to prolong

handling dishes, and environmental events; 3D sensing

the independence and living at home for people with mild

technology similar to Microsoft Kinect; a wristband with

and moderate dementia. We want to extend such living for at

multiple sensors for user’s activities recognition; RFID

least two years before constant care and living in nursing

reader that can detect tags attached to various important

homes is needed.

objects in the user’s home; localisation system using



Bluetooth Low Energy beacons to detect the exact user’s

Staying in home environments has the smallest socio-

location.

economic cost, which is important due to demographic



changes where the population is getting older.

The interpretation and reasoning from the data acquired will



be done in three stages. The first stage will use machine





61





3 PLAN AND EXECUTE DAILY ACTIVITIES

professionals will have a library of learning modules and



games that will simplify their job in training the users to live

The second system proposed has the objective of cognitive

well with dementia; the families of the person with dementia

training [8] to exercise the remaining cognitive abilities of

will be able to understand those activities in which their

people with mild cognitive impairment and moderate forms

relatives find more difficulties to help them with the coping

of Dementia to help them remain independent as long as

strategy.

possible.





3.1 TECHNOLOGY AND METHODOLOGY

While cognitive training cannot fully reverse the damage



caused by the disease, it can slow it down considerably and

The three main components of the prototype are a serious

even improve the abilities in specific areas. Research shows

game for cognitive training of executive function, a mobile

[3] that cognitive training is particularly effective when it is

device companion to track the progress of the execution of

personalized, taking into account the person’s specific

the trained activities in real life, e-learning methods to train

situation and goals – this type of cognitive training is

the users (people with Dementia and their carers) in the use

sometimes called cognitive rehabilitation. The training can

of the system, and a back-end to connect all the components.

address various cognitive functions: memory, attention,



executive function and others. While there are several



applications available for cognitive training of memory and

attention, the executive function has been neglected, which

is exactly what this system is aiming to address.



Executive function is the ability to execute tasks that involve

planning and decision making. It helps us deal with the

procedural tasks necessary to perform required set of

activities in daily life that are not well rehearsed, such as

shopping, preparing meals and visiting a doctor.



The core of the system developed in this project will be a

serious game that will simulate such tasks, including various



mishaps that can occur (forgetting a wallet while going

Figure 3. The Architecture of the 2nd prototype system

shopping, the shop being out of an ingredient for the planned



meal, taking a wrong bus when going to the doctor). This

Back-end of the system supports the previous three

will enable the users to train and prepare themselves for

components with: serious game scenarios defined during the

these tasks in the safety of their own homes on a tablet or on

gathering of user requirements; serious game configuration

a TV without the stress brought about by mishaps in the real

input by the care giver; user medical and psychological

world.

profile, preferences and the history of the use of the system;



a module for advanced data analysis for the serious game



executions in terms of personalisation, planning and analysis

of the progress.



E-learning materials are using an existing technology

OpenEDX. The OpenEDX will be extended with algorithms

to render the learning path of the users dynamic and to

consider their preferences, physiological values and

behaviour.





3.2 EXPECTED IMPACT



Although a technological intervention cannot replace live

social interaction, the system is expected to have a positive



personal impact on the affected person’s interaction with



friends and care givers.

Figure 2. The concept of the serious game system





The system should increase a quality of life and ease the

People with Dementia will have an effective way to evaluate

burden of family and care givers. The care givers will be

which exercises and e-learning modules allow them to

able to interact with the users remotely trough the cognitive-

perform their activities better; family carers and care



62



training game in a playful manner, re-acting the scenarios in

support sophisticated machine learning algorithms, devices

home environments.

such as smartphones, can build predictive models of the



context. Based on these models, actionable decisions that

4 INVOLVEMENT OF THE END USERS

impact the future state of the system and/or its environment



can be made. [9]

The end-users involvement and systems adaptation to their



needs is crucial for the success of the proposed systems. The

6 ACKNOWLEDGMENT

target groups involve people with mild to moderate



dementia, formal carers such as medical, pharmaceutical and

This work would not be possible without the insights and

outpatient services and care givers from dementia

thoughtful discussions with Dr. Mitja Luštrek, Head of the

institutions, and private subjects, and informal care givers.

Ambient Intelligence Research Group and Božidara



Cvetković, Department of Intelligence Systems at JSI. Both

We will investigate the requirements of the target groups,

have contributed with excellent knowledge on recent

primary, secondary and tertiary end users. User requirements

advances in the domain of ambient intelligence and

will be collected separately for each of the implied groups,

possibilities of mobile computing. I would like to thank

as the motivations, needs and attitude towards the proposed

Martin Frešer, the young researcher in the group, for

technology will vary.

helping to finalize the prototypes’ design.





The testing follows three principal objectives including:

Special thanks go to all the external experts for all their

exploratory studies regarding user-requirements (both

invaluable advice and their contribution. I especially wish

primary users and formal and informal care givers); input

to express my gratitude for contribution of Slovene experts:

and feedback in the development process, including the

prof. Zvezdan Pirtošek, Head of Neurology Division at

review of testing of prototypes in pilot studies; and finally

University Medical Centre Ljubljana, and Štefanija Lukič

the retrieval of input for the further development and

Zlobec, the President of the Slovenian Association for Help

potential commercialization.

with Dementia – Spominčica, Alzheimer Slovenia.





7 REFERENCES



[1] Alzheimer's disease International. World Alzheimer

Report 2015: The Global Impact of Dementia. (2015)

[2] Sonnen JA, Larson EB, Crane PK, et al. Pathological

correlates of dementia in a longitudinal, population-

based sample of aging. Ann Neurol. 2007 Oct.

62(4):406-13

[3] Keating N, Gaudet N. Quality of life of persons with

dementia. The Journal of Nutrition, Health & Aging





16(5). (2012)

Figure 4. The involvement of the end-users

[4] Moyle W, Fetherstonhaugh D, Greben M, Beattie E.



Influencers on quality of life as reported by people



living with dementia in long-term care: A descriptive

Testing is an inclusive, user-centred approach. The

exploratory approach. BMC Geriatrics 15. (2015)

involvement of end-users shall be iterative through multiple

[5] Andersen CK, Wittrup-Jensen KU, Lolk A, Andersen

cycles of design, testing and adaptation of a prototype.

K, Kragh-Sørensen P. Ability to perform activities of



daily living is the main factor affecting quality of life in

5 CONCLUSION

patients with dementia. Health and Quality of Life



Outcomes 2(52). (2004)

In this exploratory paper we provide an insight into the parts

[6] Beard RL, Knauss J, Moyer D. Managing disability and

of past year’s research and work by the ambient intelligence

enjoying life: How we reframe dementia through

group. These ideas and results are possible when combining

personal narratives. Journal of Aging Studies 23(4).

technology and methods with initiatives and in-depth

(2009)

knowledge by the experts form health and social domains.

[7] Web

page

https://www.wareable.com/wearable-



tech/best-smart-contact-lens (September, 2016) Mobile pervasive systems and devices have seen tremendous

[8] 5 Bennett DA, Schneider JA, Arvanitakis Z, et al.

advances in recent decades. Ambient intelligence is

Neuropathology of older persons without cognitive

becoming more and more pervasive with every new

impairment

from

two

community-based

studies.

generation of sensors with the added modalities and

Neurology. 2006 Jun 27. 66(12):1837-44

advances in miniaturisation processes. Equipped with an

[9] Pejovic V. Roadmap to Anticipatory Mobile Computing,

array of sensors and powerful processing hardware that can

Elektrotehniški vestnik 81(5): 247–250, 2014



63





SeaAbs: developing a low cost seafloor observatory

Aleksandar Tosić

Matjaž Kljun

University of Primorska, FAMNIT

University of Primorska, FAMNIT

Glagoljaška 8

Glagoljaška 8

Koper, Slovenia

Koper, Slovenia

aleksandar.tosic@upr.si

matjaz.kljun@upr.si

ABSTRACT

Two of the key challenges of seafloor observatories are con-

Emphasis on ocean science in the past two decades has led

stant communication capabilities and power requirements.

to an increased demand for maritime observation and data

The most widely used approach today is to have a network

collection.

Such data is commonly collected by specially

of cables put in place [7, 3] in order to connect observato-

designed maritime observatories which are used by various

ries to internet and power them. This makes such solution

stakeholders (e.g. civil society, NGO, government agencies,

relatively expensive for general use [4]. Another approach

policy makers, researchers) for various purposes (e.g. en-

to this problem is a buoy based design [2], which is usually

vironmental protection, development of sea infrastructure,

making use of satellite transmission. However, such solu-

bolstering tourism, climate change prediction). Most seafloor

tions can limit the possible placement location greatly and

explorations have so far been completed either using re-

there is still a power supply in question. The later is also

motely operated underwater vehicles (ROV’s) or static ob-

true for all acoustic based communication systems.

servatories placed on the sea floor. The latter are arguably

better suited for observing the dynamics of the oceans for

longer periods of time. A lot of effort has also been invested

into developing of on-line communication capabilities rang-

ing from optic cables to underwater acoustic wireless com-

munication systems. However, such systems demand large

monetary investment, which is often an obstacle especially

for above mentioned civil society, NGOs and researchers. In

pursuit to make seafloor observation affordable for a larger

society and to progress the ocean science we have built a

seafloor observatory as an open-source platform aiming to

increase and speed-up sea-floor exploration, data gathering

and data sharing.

Categories and Subject Descriptors

C.2.3 [Computer systems organization]: Embedded and

cyber-physical systems—Sensors and actuators

General Terms

Figure 1. SeaAbs: a low cost seafloor observatory

Seafloor Observatory Science, Seafloor observatory

before materials are mounted on

To our knowledge, with the exception of bouy Vida operated

Keywords

by Marine Biology Station Piran (described in e.g. [13]),

Seafloor Observatory, Maritime Data, Sea Exploration, Seafloor

there are no other initiatives directed towards collecting

Observatory Science

ocean data with the scientific objective of long-term moni-

toring of environmental processes in Slovenian waters. This

1.

INTRODUCTION

is the case despite an increasing worldwide demand for such

time-series data sets. Such measurements do not necessar-

Oceans are still one of the least explored ecosystems despite

ily need constant power supply and networking capabilities,

the fact that they greatly influence our climate, and con-

which we took into consideration while building our system.

sequently environment and life on our planet. There has

In this paper we describe our off-line approach suitable for

been a growing trend in the past decades of moving from

obtaining time-series data sets where analysis can be done

exploratory research with underwater vehicles to in-situ ob-

subsequently. To achieve this, we designed a prototype that

servations [5]. Oceans are dynamic and best observed from

was built using available of-the-shelf components.

a static point of view. Dynamic observation can affect the

measurements, which introduces uncertainty as if the mea-

surements are results of the dynamics of the observatory

itself [9, 7, 8, 11].

64





2.

MOTIVATION

Our motivation comes from the need to understand how dif-

ferent materials introduced to marine environments affect its

wildlife. One such example are man-made structures intro-

duced to the underwater forming artificial reefs. These are

placed usually on the featureless bottom seafloor to promote

wildlife or prevent erosion. Artificial reefs are most com-

monly created by scuttling ships, oil rigs, or are purposely

built using construction debris, tires, concrete or plastic (e.g.

PVC). These materials are likely to cause pollution by re-

leasing substances and (micro)particles not naturally found

in sea environment and often become part of the sea-life

food-chain [6].

As inferred above, material interference with live (marine)

ecosystems must be done with great care. For this purpose

we built an observatory that is designed to monitor how dif-

Figure 2. SeaAbs seafloor observatory: on the right

ferent materials are absorbed into marine environment. The

is the tube holding all necessary hardware includ-

purpose of our observatory is to identify appropriate mate-

ing sensors and camera, while on the left are placed

rials for artificial reef constriction and provide guidelines as

different materials

to which materials are better absorbed by marine environ-

ment than others. In addition we will make all collected

data freely available for others to carry out further research.

In this paper we describe hardware, system architecture and

software used in pursuit of these goals.

3.

HARDWARE AND SYSTEM ARCHITEC-

TURE

The system design can be seen in Figure 2. The base of

the observatory measures 120 cm by 80 cm and serves as a

weight that keeps the observatory in place on the sea floor.

The frame of the observatory is made of steel (marked with

blue color on Figure 2) where the left part holds different

materials for which environment absorption is being mea-

sured. On the right part of the frame we have mounted a

watertight custom built containment unit made of transpar-

Figure 3. System Architecture

ent PMMA thermoplastic in a form of a tube that holds all

the electronics of the observatory. The tube can withstand

pressure of 3.5 bar (equivalent to pressure at the depth of

To save the energy the system is turned off most of the

25 meters). This pressure capability is sufficient for planned

time and boots up automatically on pre-configured inter-

observations. However, tubes withstanding higher pressure

vals. Battery life can be increased by prolonging the wake-

can be mounted as well. The tube is locked on the frame

up intervals, hence it is very important to calculate the time

with two clamps and can be easily removed for data reading

interval depending on the duration of the experiment. How-

and battery changing.

ever, as explained above, the power bank and SD card can

be replaced at regular intervals by removing the tube from

The electronics’ system architecture was designed with adapt-

the frame and taking it out of the water, whilst keeping the

ability, affordability and extensibility in mind, and it con-

observatory underwater. This enables us to conduct unin-

sists of three main modules, namely: the power supply mod-

terrupted observations for longer.

ule, the power regulator module, and the data logging mod-

ule (see Figure 3). The modules do not require intermodule

The Power Control module consists of an Arduino and an

communication.

RTC (Real-Time Clock) shield with a dedicated 3V battery

for maintaining the RTC – hence it does not drain the power

One of the main challenges of supporting off-line long-term

from the power bank while in sleep mode. Once the RTC

observations was to keep the battery powered for as long as

triggers a wake-up signal, the Arduino turns on the Data

possible. Our power supply consists of an array of Lithium

Logging module by connecting it to the main Power Supply.

cells forming a power bank. The power bank outputs a reg-

ulated 5V at 2A. The total capacity of the power bank is

Once the Data Logging module is active, the Raspberry Pi

estimated at 50000mAh. The high capacity is needed to

boots up and immediately reads the values from the sensors

achieve the desired operating time considering the expected

and writes all the data to an SD card. After a successful

temperature at the seabed (which can drop to 5 degrees Cel-

write the module shuts down by it self, ready to be turned

sius). The temperature is expected to have a great impact

on again at the next interval. Reading sensors can also be

on the battery capacity [1] as can be seen in Figure 4.

achieved with the Arduino board. However, in our case we

65





• Writes a new line to the measurements file (on SD

card) with a timestamp and stores the image with the

same timestamp.

• Initiates a graceful shutdown

Besides this simple Python script run in rc.local on the Rasp-

berry Pi, we have written a simple script for Arduino PCB

that regulates power supply to the Raspberry Pi. The pur-

pose of the script is to ensure that the system does not drain

unnecessary power after the shut down by cutting off the

power supply to Raspberry Pi after two minutes. The pur-

pose of the script is also to forcefully shut down the Rasp-

berry if the shutdown process of the Raspberry Pi hangs

during the shutdown process.

Figure 4. Battery capacity & battery life compared

All the code for both Raspberry Pi and Arduino is freely

at different temperatures [1]

available in our Git repository http://git.redmine.pint.

upr.si/absorbcija-tujkov-v-morskem-okolju.

needed a good camera module to take photos of the ma-

5.

CONCLUSIONS

terials observed in front of the tube, which Arduino lacks.

By using mostly off-the-shelf open-source software and hard-

Using Raspberry PI also expanded the GPIO pins available

ware we designed and build a seafloor observatory system

and provided several ways to build upon the existing plat-

that is robust and reliable enough for obtaining time-series

form. An abundance of modules and shields with plug &

data sets for subsequent analysis after the system is retrieved

play capabilities are available for both platforms. Adding

from the seafloor (usually just the tube).

Compared to

new sensors and monitoring capabilities is therefore very

other seafloor observatories, our SeaAbs costs just a frac-

simple.

tion of their price (less than 1800e or $2000). This makes

it affordable for wider public and the research community

4.

SOFTWARE

enabling large scale marine environment data collation. Ef-

For the described scenario to work, we use a stock operating

fective data collection tools are indispensable for maritime

system Raspbian that comes pre-installed on Raspberry Pi.

research filed, such as: marine biology, sea ecology, physical

It is a Linux distribution, which allowed us to take advan-

oceanography, and conservation biology[3].

tage of its runlevels. These are the states of the system,

indicating if the system in is the process of booting, reboot-

We have currently built two observatories and are testing

ing, single-user mode, shutting down, or in normal multiuser

them before deploying them into North Adriatic sea. Both

modes. When runlevel changes, init runs so called rc scripts

observatories hold different materials for observing their ab-

that take care of appropriate tasks for that runlevel. After

sorption into marine ecosystem. We are focusing on mate-

all the normal services start at the end of the process of

rials that are abundantly available, comparatively inexpen-

switching to multiuser mode, the init executes /etc/rc.local

sive, and not too invasive in terms of releasing substances

to start custom services.

and (micro)particles into the sea. In the near future we are

planing to deploy both observatories in two different loca-

Our custom service is written in Python and follows the fol-

tions observing 5 different materials on each. Current sys-

lowing procedure, based on the sensors connected to GPIO

tem lacks real-time data processing capabilities which we

and the camera with the flash we use to take time-series of

plan to add in future versions. Another limitation of our

photos:

observatory is also the fact that all components are placed

underwater, hence can only be reached by scuba divers (re-

quired when power bank and memory card need replace-

• Reads the temeprature from the Waterproof DS18B20

ment).

In future versions, we plan to look into ways of

Digital temperature sensor

automating the retrial of the tank holding electronics.

• Reads the the luminosity (Lux) from the TSL2561 Dig-

Irrespective of aforementioned limitations, the project has

ital Luminosity/Lux/Light Sensor

a long-term scope.

Human interventions and operations

are causing global and regional environmental and climate

• Turns on the LED flesh made out of an array of ultra

changes that may dramatically modify the structures and

bright LED’s.

functions of marine (eco)system. The altered resource po-

tentials and ecosystem services in turn affect the livelihoods

• Snaps a photo with of the illuminated observed ma-

of the population depended on these ecosystems. Slovenia

terials (we have placed the lens of the camera on the

is a maritime country economically dependent on the state

front tube cover facing the observed materials in order

of the oceans. Therefore it needs to participate in active

to avoid glare)

monitoring of the oceans through maritime data collection

which may contribute to regional as well as global societal,

• Turns off the LED flash

economic, political, and cultural changes. Our prototype,

66





as an inexpensive open source alternative to more expensive

abyss. Springer Berlin Heidelberg, Berlin, Heidelberg,

seafloor observatories, is a step in this direction.

2015.

[5] S. Favali, P., Person, R., Barnes, C.R., Kaneda, Y.,

Furthermore, data that will be collected by our prototype

Delaney, J.R., Hsu. Seafloor Observatory Science. In

(and similar systems) offers an excellent starting-point for

Proceeding of the 10th IEEE International Conference

different simulations and models of various marine ecosys-

on Advanced Learning Technologies ICALT010, 2010.

tem aspects - one such example could be to use a combi-

[6] L. Jacobson. Artificial Reefs, Sunken Ships, and

nation of computer vision techniques [12] (to transform the

Military Toxins. http://www.toxipedia.org/

captured pictures and/or video feed into numeric features)

display/toxipedia/Artificial+Reefs,+Sunken+

and machine learning algorithms [10] (to build a mathemat-

Ships,+and+Military+Toxins, 2011.

ical model from the data) to model the affluence of vari-

[7] T. Lado Insua, K. Moran, I. Kulin, S. Farrington,

ous fish species to artificial reefs and thus improve our un-

J. Newman, M. Riedel, G. Iturrino, W. Masterson,

derstanding of artificial reef dynamics in large marine and

C. Furman, A. Klaus, M. Storms, J. Attryde,

freshwater systems.

C. Hetmaniak, and D. Huey. The Successful

Delpoyment of a New Sub-seafloor Observatory. In

Acknowledgment

American Geophysical Union Fall Meeting 2013, 2013.

The initial stage of the project was funded by AdFutura

[8] T. Lado-Insua, K. Moran, I. Kulin, S. Farrington, and

through “Path to Practical Knowledge” programme, which

J. B. Newman. SCIMPI: a new borehole observatory.

was carried out with the support of the Ministry of Educa-

Scientific Drilling, 16(16):57–61, nov 2013.

tion, Science and Sport, and the European Social Fund.

[9] T. Lado Insua, K. Moran, I. Kulin, S. Farrington,

M. Riedel, M. Scherwath, M. Heesemann, B. Pirenne,

6.

REFERENCES

G. Iturrino, W. Masterson, and C. Furman. One year

[1] Temperature effects on battery performance & life.

of data of SCIMPI borehole measurements. In

www.heliant.it/images/FV/ev_temperature_

American Geophysical Union Fall Meeting, 2014.

effects.pdf, 2015.

[10] T. M. Mitchell. Machine Learning. McGraw-Hill

[2] M. Chaffey, L. Bird, J. Erickson, J. Graybeal,

Education, New York, NY, USA, 1st edition, 1997.

A. Hamilton, K. Headley, M. Kelley, L. McBride,

[11] K. Moran, S. Farrington, E. Massion, C. Paull,

E. Mellinger, T. Meese, T. O’Reilly, W. Paul, M. Risi,

R. Stephen, A. Tréhu, and W. Ussler. SCIMPI: A new

and W. Radochonski. MBARI’s buoy based seafloor

seafloor observatory system. In OCEANS 2006, 2006.

observatory design. In Oceans ’04 MTS/IEEE

[12] M. Nixon and A. Aguado. Feature Extraction & Image

Techno-Ocean ’04 (IEEE Cat. No.04CH37600),

Processing for Computer Vision, Third Edition.

volume 4, pages 1975–1984. IEEE, 2004.

Academic Press, Oxford, UK, 3rd edition, 2012.

[3] P. Favali and L. Beranzoli. Seafloor observatory

[13] D. Turk, J. Book, and W. McGillis. pCO2 and CO2

science: A review. Annals of Geophysics,

exchange during high bora winds in the Northern

49(2-3):515–567, 2006.

Adriatic. Journal of Marine Systems, 117-118:65–71,

[4] P. Favali, L. Beranzoli, and A. De Santis. Seafloor

may 2013.

observatories: A new vision of the Earth from the

67





Primerjava razdalj pri klasifikaciji in regresiji

Tomaž Stepišnik Perdih

Panče Panov

Andrej Bauer

Fakulteta za matematiko in

Institut Jožef Stefan

Fakulteta za matematiko in

fiziko, Jadranska ulica 19, in

Jamova cesta 39

fiziko, Jadranska ulica 19,

Institut Jožef Stefan, Jamova

Ljubljana, Slovenija

Ljubljana, Slovenija

cesta 39, Ljubljana, Slovenija

pance.panov@ijs.si

andrej.bauer@fmf.uni-

tomaz.stepisnik@ijs.si

lj.si

Sašo Džeroski

Institut Jožef Stefan

Jamova cesta 39

Ljubljana, Slovenija

saso.dzeroski@ijs.si

POVZETEK

in 1 kadar nista.

V članku je narejena primerjava uspešnosti algoritma k naj-

(0, če x = y

bližjih sosedov ob uporabi različnih razdalj med vhodnimi

d(x, y) =

(1)

podatki. Primerjana je uspešnost pri klasifikaciji in regresiji

1,

sicer

s standardno obliko vhodnih podatkov (n-terice z numerič-

nimi ali nominalnimi komponentami).

Pri numeričnih podatkih uporabimo absolutno vrednost raz-

like, vendar moramo poskrbeti še za omejenost na interval

Natančnost za posamezne razdalje je bila dobljena z 10-

[0, 1]. To naredimo tako, da razliko pod določeno mejo za-

delnim prečnim preverjanjem. S Friedmanovim testom je

nemarimo in preslikamo v razdaljo 0, od neke meje naprej

bila testirana hipoteza, da je algoritem z vsemi razdaljami

pa obravnavamo kot maksimalno in preslikamo v razdaljo 1.

enako natančen. V primeru zavrnitve hipoteze so bili pari

Med tema dvema točkama pa linearno interpoliramo.

razdalj primerjani še z Nemenyijevim testom.

|x − y| − a

d(x, y) = min(1, max(0,

))

(2)

b − a

1.

UVOD

Vrednost parametra a je bila pri testiranju vedno 0, vrednost

Valentin Gjorgjioski v svojem doktoratu [4] naredi obširno

b pa odvisna od učne množice, kot bo opisano kasneje. Za

raziskavo o vplivu različnih razdalj na natančnost napove-

lažjo predstavo je odvisnost razdalje od absolutne vrednosti

dnega modeliranja. Posveti se predvsem razdaljam na struk-

razlike prikazana na sliki 1.

turiranih podatkih (časovnih vrstah ter množicah), nekoliko

manj pa razdaljam na najpogostejši strukturi: n-terici. V

tem članku bodo primerjani različni načini agregacije razdalj

na komponentah n-terice v skupno razdaljo.

Najprej bodo v drugem razdelku opisane razdalje, ki so bile

primerjane. V tretjem razdelku je opisan algoritem k naj-

bližjih sosedov in postopek različnih mer uspešnosti algo-

ritma. Opisan je tudi način primerjave razdalj ter podat-

kovne množice, na katerih so bile primerjane. V četrtem

razdelku so predstavljeni rezultati testiranja.

2.

OPIS RAZDALJ

Slika 1. Numerična razdalja v odvisnosti od absolutne vre-

dnosti razlike.

Vhodni podatki so bili podatkovne množice n-teric z nu-

meričnimi ali nominalnimi komponentami.

Razdalja med

Za razdaljo med n-tericama najprej izračunamo n razdalj

dvema n-tericama je bila dobljena tako, da so bile najprej iz-

med istoležečimi komponentami z zgoraj definiranima raz-

računane razdalje med istoležečimi komponentami, nato pa

daljama.

Za skupno razdaljo potem izračunamo uteženo

te razdalje po postopku urejenostno utežene agregacije (ang.

potenčno sredino razdalj med komponentami. Denimo, da

ordered weighted aggregation)[6] združene v skupno razdaljo.

imamo n-terici x

Opisati moramo torej računanje razdalje med numeričnimi

1, . . . , xn in y1, . . . , yn, pozitiven parame-

ter p ter nenegativne uteži o

in nominalnimi komponentami ter postopek agregacije.

1, . . . , on. Razdaljo med njima

dobimo po naslednjih korakih.

Za razdalje bomo zahtevali, da so simetrične (tj. d(x, y) =

d(y, x)) in da slikajo na interval [0, 1]. Za nominalne kompo-

1. Če je vsota uteži O = P oi enaka 0, je tudi skupna

nente uporabimo kar razdaljo, ki je 0, če sta podatka enaka,

razdalja enaka 0. Sicer računamo naprej.

68



2. Izračunamo razdalje di med komponentami.

RMSE (ang. root mean squared error ), RAE (ang. relative

absolute error ) ter RRSE (ang. root relative squared error ).

3. Razdalje med komponentami uredimo po velikosti. Do-

Napovedi naredimo za vsako mero natančnosti posebej, saj

bimo u1, . . . , un od najmanjše do največje.

se lahko pri različnih merah izbrani parametri k razlikujejo.

4. Vrnemo vrednost

Napovedi naredimo za več različnih podatkovnih množic,

1



n

!

1

p

primerjava razdalj pa poteka kot je predlagano v [3]. Za

X oiup

.

(3)

vsako mero uspešnosti s popravljenim Friedmanovim testom

O

i

i=1

testiramo hipotezo, da je algoritem k najbližjih sosedov z

vsemi razdaljami enako uspešen. Če je hipoteza zavrnjena,

Utežem o

z Nemenyjevim testom naredimo še primerjave posameznih

1, . . . , on pravimo urejenostne, saj utežujejo po na-

čelu OWA [6].

parov razdalj. Vsi testi so bili narejeni s stopnjo zaupanja

0,05.

Testirali smo 10 različnih razdalj, ki so se razlikovale veči-

noma v urejenostnih utežeh. Skice uporabljenih uteževanj

so predstavljene na sliki 3. Vse razdalje razen evklidske so

imele parameter p enak 1, pri evklidski pa je bil 2.

3.

METODOLOGIJA

Pri poskusih smo vhodnim podatkom napovedovali njihove

izhode s pomočjo učne množice. Delali smo tako klasifika-

cijo (nominali izhodi) kot regresijo (numerični izhodi). Za

napovedovanje je bil uporabljen algoritem k najbližjih so-

sedov (kNN), znotraj katerega so bile uporabljene razdalje,

Slika 2. Primer rezultatov Nemenyijevih testov.

opisane v prejšnjem razdelku.

Algoritem k najbližjih sosedov, opisan v algoritmu 1, da-

Rezultate predstavimo z grafom, primer katerega je na sliki 2.

nemu vhodnemu podatku poišče k najbližjih vhodov v učni

Na skali je označen povprečen rang razdalje, dosežen na raz-

množici in napove centroid njihovih izhodov. Pri klasifika-

ličnih podatkovnih množicah. Črta pod skalo označuje, za

ciji za izračun centroida naredimo uteženo glasovanje: vsak

koliko se morata povprečna ranga dveh razdalj razlikovati,

izmed k najbližjih vhodov glasuje za svoj izhod, utež pa

da je razlika med njima statistično pomembna.

Črte na

je obratna vrednost njegove oddaljenosti od vhodnega po-

grafu pa povezujejo razdalje, med katerimi ni statistično po-

datka. Centroid je izhod z največ glasovi. Pri regresiji pa

membnih razlik. V tem primeru lahko torej sklepamo, da je

izračunamo uteženo povprečje izhodov, uteži pa so ponovno

razdalja A boljša od razdalj D in C, ter da je B boljša od C.

obratne vrednosti oddaljenosti. Če je oddaljenost katerega

Med ostalimi pari pa ni pomembnih razlik.

soseda 0, je njegova utež neskončna. V tem primeru vse

neskončne uteži zamenjamo z 1, preostale pa z 0, da o cen-

Za regresijo smo uporabili 10 podatkovnih množic, katerih

troidu odločajo enakomerno sosedi z oddaljenostjo 0.

lastnosti so prikazane v tabeli 1. Klasifikacija je bila nare-

jena v dveh delih. V prvem delu je bilo uporabljenih 10 po-

Algoritem 1 k najbližjih sosedov

datkovnih množic z nizkimi dimenzijami vhodov (podobno

Vhod: Učna množica T , razdalja na vhodih d, vhod x.

kot pri regresiji), predstavljenih v tabeli 2. V drugem delu

Izhod: Napoved za izhod od x.

pa smo uporabili podatkovne množice s precej višjimi di-

1: Izberi k glede na T in d.

menzijami vhodov. Bilo jih je prav tako 10, opisane pa so v

2: Izračunaj razdaljo d med x in vhodi v T .

tabeli 3. Podatkovne množice z nizkodimenzionalnimi vhodi

3: Poišči izhode k-tih najbližjih vhodov.

so bile pridobljene s spletne strani [2], množice z visokodi-

4: vrni: Centroid teh izhodov.

menzionalnimi vhodi pa s strani [1]. Izbira množic je bila

pretežno naključna, z nekoliko ozira na čim večjo raznolikost

Primerjali smo uspšenost napovedovanja z različnimi razda-

ljami v algoritmu kNN. Uspešnost smo merili z 10-delnim

Tabela 1. Podatkovne množice za regresijo

prečnim preverjanjem. Za vsako od desetih učnih množic

je bil posebej določen parameter b pri numerični razdalji

podat. množica

primerov

atributov

(enačba 2) in sicer kot razpon vseh števil na tej komponenti

pri podatkih v učni množici. Potem je bil glede na učno mno-

quake

2178

3

žico izbran tudi parameter k. Z metodo ”izloči enega”(ang.

pbc

418

18

leave-one-out ) dobimo napovedi za podatke v učni množici

housing

506

13

za k med 1 in 50, ter izberemo tisti k, pri katerem je napaka

meta

528

21

najmanjša (oz. natančnost največja).

strike

625

6

abalone

4177

8

Uporabimo mere uspešnosti iz programskega orodja WEKA[5].

autoMpg

398

7

Za klasifikacijo uporabimo utežena povprečja mer natanč-

hungarian

294

13

nost (ang. precision), priklic (ang. recall ) in f1 po razre-

cpu

209

7

dih, za regresijo pa mere MAE (ang. mean absolute error ),

pharynx

195

11

69





maximum

stopnica

sigmoida

linearna

povprečje,evklidska

obrezana

gauss

trikotnik

mediana

Slika 3. Skice urejenostnih uteži.

v številu primerov, atributov, pri klasifikaciji tudi razredov.

Tabela 2. Podatkovne množice za klasifikacijo.

podat. množica

primerov

atributov

razredov

iris

150

4

3

diabetes

768

8

2

credit-a

690

15

2

car

1728

6

4

primary-tumor

339

17

22

autos

205

25

7

mfeat-morpho

2000

6

10

vote

435

16

2

(a) MAE

tic-tac-toe

958

9

2

credit-g

1000

20

2

Tabela 3. Podatkovne množice visokih dimenzij za klasifi-

kacijo.

podat. množica

primerov

atributov

razredov

amlPrognosis

54

12625

2

(b) RMSE

bladderCancer

40

5724

3

breastCancer

24

12625

2

childhoodAll

110

8280

2

cmlTreatment

28

12625

2

dlbcl

77

7070

2

levkemija

72

5147

2

mll

72

12533

3

prostata

102

12533

2

srbct

83

2308

4

(c) RAE

4.

REZULTATI

4.1

Regresija

Pri regresijskih podatkih je Friedmanov test hipotezo ovrgel

za vse štiri mere natančnosti. To pomeni, da med različnimi

razdaljami obstajajo razlike v uspešnosti. Te so opisane z

rezultati Nemenyjevih testov, predstavljenih na sliki 4.

(d) RRSE

Pri meri MAE je sigmoida boljša od mediane in maksimuma,

Slika 4. Diagrami povprečnih rangov z rezultati Nemenyi-

poleg tega pa so od mediane boljše tudi stopnica, linearna

jevih testov za kNN z različnimi razdaljami na regresijskih

in evklidska. Pri RMSE je sigmoida boljša od maksimuma,

množicah.

pri RRSE pa linearna od mediane. Pri RAE so od mediane

boljše linearna, sigmoida in povprečje.

Če gledamo samo

povprečne range lahko vidimo, da nobena razdalja nima pre-

cej večjega uspeha od ostalih, da pa sta ranga mediane in

maximuma večinoma precej slabša od rangov ostalih razdalj.

70





(a) natančnost

(b) priklic

(c) f1

Slika 5.

Diagrami povprečnih rangov z rezultati Nemenyijevih testov za kNN z različnimi razdaljami na klasifikacijskih množicah nizke dimenzije.

(a) natančnost

(b) priklic

(c) f1

Slika 6.

Diagrami povprečnih rangov z rezultati Nemenyijevih testov za kNN z različnimi razdaljami na klasifikacijskih množicah visoke dimenzije.

4.2

Klasifikacija

like razdalje med komponentami (z izjemo maksimuma, ki je

Tudi tukaj je Friedmanov test hipotezo zavrgel za vse mere.

ekstremen primer teh razdalj). V večini primerov ni bilo ene

Razlike v uspešnosti opisujejo rezultati Nemenyijevih testov

razdalje, ki bi bila bistveno uspešnejša od ostalih, pogosto

na sliki 5.

pa sta mediana in maksimum odstopala v negativno smer.

Testi pri meri natančnost niso pokazali nobenih statistično

Delo je predvsem podlaga za nadaljnje raziskovanje. Primer-

pomembnih razlik. Pri meri priklic je stopnica boljša od

java razdalj se lahko naredi tudi v primerih, ko so vhodni

maksimuma in mediane, od mediane pa so boljše tudi gauss,

ali izhodni podatki strukturirani (zaporedja, množice, . . . ).

sigmoida, povprečje in evklidska. Pri meri f1 sta gauss in

Program, narejen za testiranje, omogoča algoritmu k naj-

stopnica boljša od maksimuma in mediane. Na zadnjih treh

bližjih sosedov, da glede na trening množico podobno kot

mestih po povprečnih rangih so vedno obrezana, maksimum

parameter k izbere tudi razdaljo, ki jo bo uporabljal za na-

in mediana.

povedovanje, izmed množice možnih. Več možnih razdalj

načeloma poveča možnosti za dobro napovedovanje, vendar

4.3

Klasifikacija podatkov visoke dimenzije

poveča časovno zahtevnost izbiranja razdalje. Zanimivo bi

bilo na primer testirati, kateri nabori razdalj dobro delujejo

Friedmanov test znova hipotezo zavrne pri vseh merah. Raz-

skupaj.

like v uspešnosti opišejo rezultati Nemenyijevih testov na

sliki 6.

6.

VIRI

Rezultati so precej drugačni kot pri podatkih nizke dimen-

[1] Bioinformatics laboratory.

zije. Pri vseh treh merah je sedaj le maksimum statistično

http://www.biolab.si/supp/bi-cancer/projections/.

slabši od prvih nekaj razdalj.

Če pogledamo povprečne

Obiskano: 2016-06-24.

range, vidimo da je res daleč za ostalimi razdaljami, ki so

[2] UCI Machine Learning Repository.

tukaj bolj na kupu kot v prejšnjih primerih. Zanimivo je,

http://archive.ics.uci.edu/ml/. Obiskano: 2016-06-24.

da se mediana odreže toliko bolje (pri meri f1 celo najbolje),

[3] J. Demšar. Statistical comparisons of classifiers over

saj ravno tako kot maksimum za skupno razdaljo vzame le

multiple data sets. Journal of Machine Learning

eno od več tisočih komponent.

Research, 7:1–30, 2006.

[4] V. Gjorgjioski. Distance-Based Learning from

5.

ZAKLJU ČEK

Structured Data. PhD thesis, Mednarodna podiplomska

Primerjana je bila uspešnost algoritma k najbližjih sosedov

šola Jožefa Stefana, Ljubljana, 2014.

ob uporabi različnih razdalj med podatki, v primerih ko so

[5] M. Hall, E. Frank, G. Holmes, B. Pfahringer,

vhodni podatki n-terice z numeričnimi ali nominalnimi kom-

P. Reutemann, and I. H. Witten. The WEKA Data

ponentami, ciljna spremenljivka pa prav tako numerična (re-

Mining Software: An Update. SIGKDD Explor. Newsl.,

gresija) ali nominalna (klasifikacija). Uporabljenih je bilo

11(1):10–18, Nov. 2009.

več mer uspešnosti, merjenih z 10-delnim prečnim prever-

[6] R. R. Yager and J. Kacprzyk, editors. The Ordered

janjem, primerjava pa je bila narejena s Friedmanovimi in

Weighted Averaging Operators. Springer, 1997.

Nemenyijevimi testi.

Testi so pokazali večjo uspešnost razdalj, ki bolj utežijo ve-

71





Avtomatska prepoznava teniških udarcev iz senzorskih

podatkov

Miha Mlakar

Institut »Jožef Stefan«

Jamova cesta 39

1000 Ljubljana

miha.mlakar@ijs.si



POVZETEK

2. PRIDOBITEV PODATKOV

V tem članku smo prikazali postopek izdelave sistema za

Za testiranje smo posneli šest različnih tekem s petimi različnimi

avtomatsko prepoznavo teniških udarcev iz senzorskih podatkov.

igralci. Ker nas je zanimala avtomatska prepoznava udarcev v

Prikazali smo, kako je potekal zajem, kakšno je potrebno pred-

realnem okolju, smo vse posnetke posneli med tekmami, ne med

procesiranje podatkov, izračun potrebnih atributov in nato gradnja

treningi v predvidljivem okolju.

modelov. Modele smo ovrednotili na dva različna načina in

prikazali, da dobljeni model avtomatske prepoznave udarcev

Igralci so imeli na sebi napravo S5 podjetja Catapult Sports, ki so

deluje zelo dobro.

ga nosili na hrbtu med lopaticami v zato namenjeni oprijeti majici

(slika 1)

Pridobljeni sistem je mišljen kot začetni del celovitega sistema za

analizo teniške igre iz senzorskih podatkov. Prepoznavanje tipov

udarcev je osnova iz katere bodo nato izhajale prihodnje analize.

Splošni izrazi

Algoritmi, meritve, testiranje, poskusi.

Ključne besede

Senzorske meritve, tenis, pospeškomer, teniški udarci.

1. UVOD

Uporaba oblačilnih senzorjev, ki jih športniki nosijo med športno

vadbo je trenutno v velikem porastu. Različni senzorji in pristopi

se uporabljajo v večini športov. Naprave, ki uporabljajo različne

senzorje za izvajanje analiz in meritev v tenisu lahko razdelimo na

tri skupine. Prva skupina so senzorji, ki se jih nosi na roki,



oziroma so vgrajeni v teniški lopar. Ti senzorji nam prinašajo

Slika1: Lokacija senzorja, ki smo ga uporabljali za meritve. Vir:

informacije o udarcih, vendar je ta informacija brez konteksta in

sporttechie.com

je zato težko uporabna za izboljševanje teniške igre. Druga

skupina so senzorji, ki se nosijo na hrbtu in poleg pospeškomerov

Naprava S5 je vseboval naslednje senzorje:

in žiroskopov vsebujejo GPS sprejemnik za zaznavanje premikov

-

Tridimenzionalni

pospeškomter

(osveževanje

s

igralca. Vendar pa se GPS izkaže za relativno nenatančnega, tako

frekvenco 100HZ)

da so napake velike od 3-5 metrov, kar je glede na velikost

teniškega igrišča preveč, da bi bile lahko meritve uporabne za

-

Tridimenzionalni žiroskop (osveževanje s frekvenco

določanje položaja na igrišču. Pole tega ti sistemi ne nudijo

100HZ)

nobenih teniško prilagojenih meritev in analiz. Tretja skupina pa

so sistemi, ki temeljijo na video analizi. Pri teh sistemih je

-

Tridimenzionalni

magnetometer

(osveževanje

s

potrebno postaviti 4-8 kamer okoli igrišča in potrebno je kupiti

frekvenco 100HZ)

pripadajočo programsko opremo. Zaradi tega so ti sistemi izredno

-

GPS senzor, ki je vračal pozicijo v obliki zemljepisna

dragi in so vezani na igrišče in ne na igralca, tako da, če igralec

širina in dolžina (osveževanje s frekvenco 10HZ)

igra na nekem drugem igrišču, mu analize niso na razpolago.



Kljub tem različnim možnostim pa na trgu trenutno ni senzorja, ki

bi prepoznaval tako gibanje kot tenisko specifične metrike. Zato

Na ta način smo posneli za več kot 6 ur posnetkov. Ker so bili

smo vzeli obstoječi senzor S5 podjetja Catapult Sports in ga

podatki pridobljeni 100 krat na sekundo smo tako pridobili

uporabili za naše namene. Iz senzorja smo pridobili surove

2.172.363 podatkovnih vnosov. V tem času smo zabeležili 1.373

podatke, ki smo jih nato analizirali in izdelali avtomatski sistem

udarcev. Vsak udarec smo označili kot eno izmed naslednjih

za prepoznavanje teniških udarcev. Prepoznava udarcev nam bo

možnosti:

služila kot izhodišče za vse nadaljnje analize teniške igre, zato je

-

Servis

kvalitetna prepoznava udarcev zelo pomembna.

-

Forehand

72





-

Backhand



Izračunana razlika med vrednostma kotnih hitrosti za osi 1 in 3



žiroskopa pri udarcu nam je služila kot začetno izhodišče pri

avtomatskem določanju tipa udarcev.

3. PRED-PROCESIRANJE



Na podlagi izrisanih signalov za pospeške in hitrosti obračanja

Postopek pridobivanja spremenljivke, ki nam pomaga pri

smo najprej poskusili ugotoviti kaj so ključne značilke za

odkrivanju ali gre pri neki kombinaciji senzorskih vrednosti za

diferenciacijo udarcev. Na sliki 2 lahko vidimo tipično sled

udarec sledi sledečim korakom:

pospeškomera in žiroskopa pri backhand udarcu.

-

izračunamo

novo spremenljivko

Spin, ki nam



predstavlja absolutno vsoto kotnih hitrosti pridobljenih

z žiroskopom

-

spremenljivko Spin potenciramo, da damo poudarek

večjim vrednostim

-

apliciramo

Butterworth

pasovni

filter

[1]

na

spremenljivko Spin

-

negativne vrednosti popravimo na 0

-

omejimo minimalno razdaljo med dvema vrhovoma

(razdaljo smo eksperimentalno določili na 1,3 sekunde)



Na ta način dobimo spremenljivko, ki nam označuje potencialne

udarce in jo poimenujemo Peaks. Ta spremenljivka seveda ni

dovolj za prepoznavo ali se je nek udarec zgodil v nekem trenutku

ali ne, nam pa pomaga pri identifikaciji. V gradnjo modela je

potrebno vključiti še dodatne spremenljivke, kot so povprečne

vrednosti in standardni odkloni posameznih parametrov ter vsote



in razlike pospeškov in kotnih hitrosti po različnih oseh.

Slika 2: Vrednosti pospeškomera in žiroskopa skozi čas pri



backhand udarcu označenemu z navpično črno črto.

Kot primer je na sliki 4 prikazano, kako nam spremenljivki Peaks



(prikazane so le vrednosti večje od 400) in hitrost premikanja

Ko smo sled žiroskopa pri backhandu primerjali s sledjo pri

pridobljena iz GPS podatkov razdelita prostor, kar nam pomaga

forhendu smo ugotovili, da se ključno razlikujeta po vrednostih

pri identifikaciji udarcev.

kotnih hitrosti izmerjenih po osi 1 in 3. Na sliki 3 je prikazana



razlika.





Slika 4: Razdelitev prostora potencialnih udarcev glede na

spremenljivki Peaks in hitrost premikanja.





4. TESTIRANJE

Testiranje smo razdelili na dva dela. Najprej smo testirali, kako

točno lahko napovemo ali in kdaj se je zgodil udarec. V drugem

delu pa smo poskusili napovedati tudi tip udarca – forehand,

backhand ali servis.

Za gradnjo modela za napovedovanje udarcev smo v obeh

primerih uporabili metodologijo naključnih gozdov (RF) [2].



Slika 3: Razlika v vrednostih kotnih hitrosti za os 1 in 3 za

Vzeli smo implementacijo v programskem jeziku Python

žiroskop pri backhand in forehand udarcu.

implementirano v paketu scikit-learn. Metoda RF je vsebovala 10

73



naključnih dreves, ostalih parametrov pa nismo spreminjali, tako

5.2 Navzkrižno preverjanje s po enim

da so imeli privzete vrednosti.

izpuščenim igralcem

Model smo testirali na dva načina. Pri prvem načinu smo za



preverjanje točnosti modela uporabili metodo desetkratnega

Z metodo navzkrižnega preverjanja s po enim izpuščenim

prečnega preverjanja [3], ki je za deljenje podatkov znotraj

igralcem smo dobili rezultate za natančnost in priklic ločeno za

prečnega preverjanja uporabljala postopek uravnoteženega

vsakega igralca.

naključne delitve (Stratified shuffle split) [4]. Ta postopek

zagotavlja, da so pri vsakem prečnem preverjanju razmerja med

Rezultati klasifikacije za forehand udarec:

razredi v testni in učni množici enaka. To nam zagotovi pravilno

povprečje rezultata čez celotno prečno preverjanje.

-

Natančnost: 91.5% [94%, 91%, 84%, 96%, 92%]

Pri drugem načinu smo testirali točnost modela po metodi

-

Priklic: 90.5% [91%, 88%, 86%, 91%, 96%]

navzkrižnega preverjanja s po enim izpuščenim igralcem, ki

Rezultati klasifikacije za backhand udarec:

deluje po principu, da se model nauči na podatkih, ki ne vsebuje

vrednosti izpuščenega igralca. Te vrednosti pa se nato uporabijo

-

Natančnost: 93.6% [98%, 98%, 84%, 91%, 95%]

za testiranje točnosti modela. Ta postopek se ponovi tolikokrat,

-

Priklic: 90.6% [96%, 78%, 85%, 96%, 95%]

kolikor imamo različnih igralcev. Tak pristop nam omogoča

vpogled v univerzalnost modela, saj nam njegovo dobro delovanje

Rezultati klasifikacije za servis udarec:

pove, da bo model uspešno napovedoval udarce tudi za igralce, ki

-

Natančnost: 99.8% [100%, 100%, 100%, 99%, 100%]

jih nismo vključili v testiranje.

-

Priklic: 98.2% [98%, 98%, 100%, 99%, 95%]



Skupni povprečni rezultati preko vseh udarcev (in igralcev):

5. REZULTATI

-

Natančnost 95.0%

5.1 Metoda prečnega preverjanja

-

Priklic 93.1%

Z metodo prečnega preverjanja smo klasificirali točnost

Kot lahko vidimo iz rezultatov sta natančnost in priklic

napovedovanja forehand, backhand in servis udarcev.

pridobljena z metodama prečnega preverjanja in navzkrižnega

Rezultati klasifikacije za forehand udarec:

preverjanja s po enim izpuščenim igralcem podobna. To pomeni,

da je model relativno neodvisen od tipa igralca, oziroma od

-

Natančnost 95.3%

tehnike izvedbe udarcev in da model, ki smo ga naučili na ostalih

-

Priklic 91.4%

udarcih dobro napoveduje tudi za izbranega igralca.

Rezultati klasifikacije za backhand udarec:

Glavni vir napak so posebnosti, ki se dogodijo med samo igro, kot

je recimo udarec senzorja z loparjem, posebni gibi, ki jih ne

-

Natančnost 94.3%

moremo klasificirati med izbrane tri udarce in pa posebni primeri

hitrega obračanja igralca okrog svoje osi, kar rezultira v podobnih

-

Priklic 90.2%

vrednostih senzorjev, kot jih imamo pri udarcih.

Rezultati klasifikacije za servis udarec:

Ker nam model zgrajen z RF ne omogoča prikaza kompleksnost

-

Natančnost 99.1%

smo zgradili odločitveno drevo za detekcijo tipov udarcev (slika

5). Ker je drevo preveč kompleksno za podroben prikaz,

-

Priklic 99.3%

prikazujemo le shemo drevesa, da prikažemo strukturo in

Skupni povprečni rezultati preko vseh udarcev:

kompleksnost dobljenega modela. Kot lahko vidimo je drevo

razvejano in uporablja veliko različnih atributov v vozliščih.

-

Natančnost 96.2%

-

Priklic 93.6%

6. ZAKLJUČEK

V članku smo pokazali postopek za gradnjo modela za

74

avtomatsko detekcijo teniških udarcev iz realni tekem in ne le

[2] Liaw, Andy, and Matthew Wiener. "Classification and

treningov v kontroliranem okolju. Vsak igralec je na hrbtu nosil

regression by randomForest." R news 2.3 (2002): 18-22.

posebno napravo s senzorji iz katerih smo nato s pred-

[3] Kohavi, Ron. "A study of cross-validation and bootstrap for

procesiranjem in izdelavo atributov zgradili model, ki nam je

accuracy estimation and model selection." Ijcai. Vol. 14. No.

služil za prepoznavo udarcev. Natančnost in priklic modela sta

2. 1995.

bila nad 93%, kar je zadovoljiv rezultat. Z dodatnim pred-

procesiranjem podatkov, ki bi odstranil posebnosti in pogladil

[4] Balmin, Andrey, et al. "Stratified sampling using adaptive

vrednosti senzorjev, pa bi natančnost in priklic lahko še povečali.

parallel data processing." U.S. Patent Application No.

14/141,635.

Pridobljeni model je dovolj dober, da nam bo lahko v prihodnosti



služil kot izhodišče za nadaljnje raziskave in analize v tenisu.





7. LITERATURA

[1] Selesnick, Ivan W., and C. Sidney Burrus. "Generalized

digital Butterworth filter design." IEEE Transactions on

Signal Processing 46.6 (1998): 1688-1694.



75





76





Indeks avtorjev / Author index



Bauer Andrej ................................................................................................................................................................................ 68

Bohanec Marko .............................................................................................................................................................................. 5

Češnovar Rok ................................................................................................................................................................................. 9

Cvetković Božidara ...................................................................................................................................................................... 13

Dimitriev Aleksandar ................................................................................................................................................................... 17

Dovgan Erik ................................................................................................................................................................................. 21

Džerovski Saso ............................................................................................................................................................................. 68

Fabjan David A. ........................................................................................................................................................................... 60

Filipič Bogdan .............................................................................................................................................................................. 52

Frešer Martin ................................................................................................................................................................................ 13

Gams Matjaž .......................................................................................................................................................................... 44, 56

Gjoreski Martin ...................................................................................................................................................................... 13, 56

Gradišek Anton ............................................................................................................................................................................ 25

J. Kok Esther .................................................................................................................................................................................. 5

Janko Vito .................................................................................................................................................................................... 28

Javornik Anže ............................................................................................................................................................................... 32

Kljun Matjaž ................................................................................................................................................................................. 64

Kocjan Andraž ............................................................................................................................................................................. 25

Kononenko Igor ........................................................................................................................................................................... 32

Kosiedowski Michał ..................................................................................................................................................................... 13

Kužnar Damjan ............................................................................................................................................................................ 44

Luštrek Mitja .................................................................................................................................................................... 13, 28, 56

Mileva Boshkoska Biljana.............................................................................................................................................................. 5

Mlakar Miha ........................................................................................................................................................................... 25, 72

Panov Panče ................................................................................................................................................................................. 68

Pevec Darko ................................................................................................................................................................................. 32

Ploj Bojan ..................................................................................................................................................................................... 36

Robnik-Šikonja Marko ................................................................................................................................................................. 40

Šef Tomaž .................................................................................................................................................................................... 48

Stepišnik Perdih Tomaž ............................................................................................................................................................... 68

Štrumbelj Erik .......................................................................................................................................................................... 9, 17

Terbuc Martin ............................................................................................................................................................................... 36

Tošić Aleksandar .......................................................................................................................................................................... 64

Tušar Tea ...................................................................................................................................................................................... 52

W. Prins Theo ................................................................................................................................................................................. 5

Žitnik Lovro ................................................................................................................................................................................. 40

Zupančič Jernej ............................................................................................................................................................................ 44





77





78





Konferenca / Conference

Uredili / Edited by

Slovenska konferenca o umetni inteligenci /

Slovenian Conference on Artificial Intelligence

Matjaž Gams, Mitja Luštrek, Rok Piltaver





Document Outline


Blank Page

Blank Page

Blank Page

01_18.pdf 01 - Boshkoska

02 - sigproc-sp

03 - Cvetkovic

04 - sigproc-sp

05 - Dovgan

06 - Gradisek - HEA IS 2016

07 - Janko

08 - Javornik

09 - Ploj---IS2016_lektorirano

10 - ebook-recommendations-2016-revised

11 - Zupancic - metis_use_case Introduction

Related work

Experimental setup The data

Model evaluation

The grid





Results Metis use-case results

The grid experience





Conclusion

References





12 - Inteligentni_sistemi_IS2016_Šef_Tomaž

13 - Tusar

14 - GjoreskiMartin

15 - IS_2016_DAF_EN_v14

16 - Kljun sigproc-sp Introduction

Motivation

Hardware and System Architecture

Software

Conclusions

References





17 - tomaz-stepisnik

18 - Avtomatska prepoznava teniških udarcev iz senzorskih podatkov - Miha Mlakar_corrected





Blank Page

Blank Page

Blank Page