Zbornik 17. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2014 Zvezek A Proceedings of the 17th International Multiconference INFORMATION SOCIETY – IS 2014 Volume A Inteligentni sistemi Intelligent Systems Uredila / Edited by Rok Piltaver, Matjaž Gams http://is.ijs.si 7.−8. oktober 2014 / October 7th−8th, 2014 Ljubljana, Slovenia Zbornik 17. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2014 Zvezek A Proceedings of the 17th International Multiconference INFORMATION SOCIETY – IS 2014 Volume A Inteligentni sistemi Intelligent Systems Uredila / Edited by Rok Piltaver, Matjaž Gams http://is.ijs.si 7. - 8. oktober 2014 / October 7th - 8th, 2014 Ljubljana, Slovenia Urednika: Rok Piltaver Odsek za inteligentne sisteme Institut »Jožef Stefan«, Ljubljana Matjaž Gams Odsek za inteligentne sisteme Institut »Jožef Stefan«, Ljubljana Založnik: Institut »Jožef Stefan«, Ljubljana Priprava zbornika: Mitja Lasič, Vesna Lasič, Lana Zemljak Oblikovanje naslovnice: Vesna Lasič, Mitja Lasič Ljubljana, oktober 2014 CIP - Kataložni zapis o publikaciji Narodna in univerzitetna knjižnica, Ljubljana 004.89(082)(0.034.2) MEDNARODNA multikonferenca Informacijska družba (17 ; 2014 ; Ljubljana) Inteligentni sistemi [Elektronski vir] : zbornik 17. mednarodne multikonference - IS 2014, 7-8 oktober 2014, Ljubljana, Slovenija : zvezek A = Intelligent systems : proceedings of the 17th International Multiconference Information Society - IS 2014, October 7th-8th, 2014, Ljubljana, Slovenia : volume A / uredila/edited by Rok Piltaver, Matjaž Gams. - El. knjiga. - Ljubljana : Institut Jožef Stefan, 2014 Način dostopa (URL): http://library.ijs.si/Stacks/Proceedings/InformationSociety ISBN 978-961-264-071-2 (pdf) 1. Gl. stv. nasl. 2. Vzp. stv. nasl. 3. Dodat. nasl. 4. Piltaver, Rok 27986727 PREDGOVOR MULTIKONFERENCI INFORMACIJSKA DRUŽBA 2014 Multikonferenca Informacijska družba (http://is.ijs.si) s sedemnajsto zaporedno prireditvijo postaja tradicionalna kvalitetna srednjeevropska konferenca na področju informacijske družbe, računalništva in informatike. Informacijska družba, znanje in umetna inteligenca se razvijajo čedalje hitreje. Čedalje več pokazateljev kaže, da prehajamo v naslednje civilizacijsko obdobje. Npr. v nekaterih državah je dovoljena samostojna vožnja inteligentnih avtomobilov, na trgu pa je moč dobiti kar nekaj pogosto prodajanih tipov avtomobilov z avtonomnimi funkcijami kot »lane assist«. Hkrati pa so konflikti sodobne družbe čedalje bolj nerazumljivi. Letos smo v multikonferenco povezali dvanajst odličnih neodvisnih konferenc in delavnic. Predstavljenih bo okoli 200 referatov, prireditev bodo spremljale okrogle mize, razprave ter posebni dogodki kot svečana podelitev nagrad. Referati so objavljeni v zbornikih multikonference, izbrani prispevki bodo izšli tudi v posebnih številkah dveh znanstvenih revij, od katerih je ena Informatica, ki se ponaša s 37-letno tradicijo odlične evropske znanstvene revije. Multikonferenco Informacijska družba 2014 sestavljajo naslednje samostojne konference: • Inteligentni sistemi • Izkopavanje znanja in podatkovna skladišča • Sodelovanje, programska oprema in storitve v informacijski družbi • Soočanje z demografskimi izzivi • Vzgoja in izobraževanje v informacijski družbi • Kognitivna znanost • Robotika • Jezikovne tehnologije • Interakcija človek-računalnik v informacijski družbi • Prva študentska konferenca s področja računalništva • Okolijska ergonomija in fiziologija • Delavnica Chiron. 
Soorganizatorji in podporniki konference so različne raziskovalne in pedagoške institucije in združenja, med njimi tudi ACM Slovenija, SLAIS in IAS. V imenu organizatorjev konference se želimo posebej zahvaliti udeležencem za njihove dragocene prispevke in priložnost, da z nami delijo svoje izkušnje o informacijski družbi. Zahvaljujemo se tudi recenzentom za njihovo pomoč pri recenziranju.
V 2014 bomo drugič podelili nagrado za življenjske dosežke v čast Donalda Michija in Alana Turinga. Nagrado Michie-Turing za izjemen življenjski prispevek k razvoju in promociji informacijske družbe je prejel prof. dr. Janez Grad. Priznanje za dosežek leta je pripadlo dr. Janezu Demšarju. V letu 2014 četrtič podeljujemo nagrado »informacijska limona« in »informacijska jagoda« za najbolj (ne)uspešne poteze v zvezi z informacijsko družbo. Limono je dobila nerodna izvedba piškotkov, jagodo pa Google Street View, ker je končno posnel Slovenijo. Čestitke nagrajencem!
Niko Zimic, predsednik programskega odbora
Matjaž Gams, predsednik organizacijskega odbora
FOREWORD - INFORMATION SOCIETY 2014
The Information Society Multiconference (http://is.ijs.si) has become one of the traditional leading conferences in Central Europe devoted to the information society. In its 17th year, it delivers a broad range of topics in an open academic environment that fosters new ideas, which makes our event unique among similar conferences, promoting key visions in interactive, innovative ways. As knowledge progresses ever faster, it seems that we are indeed approaching a new civilisation era. For example, several countries allow autonomous car driving, and several car models enable autonomous functions such as “lane assist”. At the same time, however, it is hard to understand the growing conflicts in human civilisation.
The multiconference runs in parallel sessions with about 200 presentations of scientific papers, organised in twelve independent events. The papers are published in the Web conference proceedings, and a selection of them in special issues of two journals. One of them is Informatica, with its 37 years of tradition in excellent research publications.
The Information Society 2014 Multiconference consists of the following conferences and workshops: • Intelligent Systems • Cognitive Science • Data Mining and Data Warehouses • Collaboration, Software and Services in Information Society • Demographic Challenges • Robotics • Language Technologies • Human-Computer Interaction in Information Society • Education in Information Society • 1st Student Computer Science Research Conference • Environmental Ergonomics and Physiology • Chiron Workshop.
The Multiconference is co-organized and supported by several major research institutions and societies, among them ACM Slovenia, SLAIS and IAS. In 2014, the award for life-long outstanding contributions was delivered in memory of Donald Michie and Alan Turing for the second consecutive year. The Programme and Organizing Committees decided to award the Michie-Turing Award to Prof. Dr. Janez Grad. In addition, the award for the achievement of the year went to Prof. Dr. Janez Demšar. The information strawberry was awarded to Google Street View for finally covering Slovenia, while the information lemon went to the awkward introduction of cookie notices. Congratulations!
On behalf of the conference organizers we would like to thank all participants for their valuable contributions and their interest in this event, and particularly the reviewers for their thorough reviews.
Niko Zimic, Programme Committee Chair Matjaž Gams, Organizing Committee Chair ii KONFERENČNI ODBORI CONFERENCE COMMITTEES International Programme Committee Organizing Committee Vladimir Bajic, South Africa Matjaž Gams, chair Heiner Benking, Germany Mitja Luštrek Se Woo Cheon, Korea Lana Zemljak Howie Firth, UK Vesna Koricki-Špetič Olga S. Fomichova, Russia Mitja Lasič Vladimir A. Fomichov, Russia Robert Blatnik Vesna Hljuz Dobric, Croatia Mario Konecki Alfred Inselberg, Izrael Vedrana Vidulin Jay Liebowitz, USA Huan Liu, Singapore Henz Martin, Germany Marcin Paprzycki, USA Karl Pribram, USA Claude Sammut, Australia Jiri Wiedermann, Czech Republic Xindong Wu, USA Yiming Ye, USA Ning Zhong, USA Wray Buntine, Finland Bezalel Gavish, USA Gal A. Kaminka, Israel Mike Bain, Australia Michela Milano, Italy Derong Liu, Chicago, USA Toby Walsh, Australia Programme Committee Nikolaj Zimic, chair Matjaž Gams Ivan Rozman Franc Solina, co-chair Marko Grobelnik Niko Schlamberger Viljan Mahnič, co-chair Nikola Guid Stanko Strmčnik Cene Bavec, co-chair Marjan Heričko Jurij Šilc Tomaž Kalin, co-chair Borka Jerman Blažič Džonova Jurij Tasič Jozsef Györkös, co-chair Gorazd Kandus Denis Trček Tadej Bajd Urban Kordeš Andrej Ule Jaroslav Berce Marjan Krisper Tanja Urbančič Mojca Bernik Andrej Kuščer Boštjan Vilfan Marko Bohanec Jadran Lenarčič Baldomir Zajc Ivan Bratko Borut Likar Blaž Zupan Andrej Brodnik Janez Malačič Boris Žemva Dušan Caf Olga Markič Leon Žlajpah Saša Divjak Dunja Mladenič Igor Mekjavić Tomaž Erjavec Franc Novak Tadej Debevec Bogdan Filipič Vladislav Rajkovič Andrej Gams Grega Repovš iii iv KAZALO / TABLE OF CONTENTS Inteligentni sistemi / Intelligent Systems ................................................................................................................. 1 PREDGOVOR / FOREWORD ................................................................................................................................. 3 Multiobjective Optimisation of Water Heater Scheduling / Brence Jure, Gosar Žiga, Seražin Vid, Zupančič Jernej, Gams Matjaž ........................................................................................................................................... 5 Analiza nakupov in modeliranje pospeševanja prodaje v spletni trgovini mercator / Černe Matija, Kaluža Boštjan, Luštrek Mitja ......................................................................................................................................... 9 Analiza možnosti zaznavanja podobnosti med uporabniki / Cvetković Božidara, Luštrek Mitja .......................... 14 Visualization of Explanations of Incremental Models / Demšar Jaka, Bosnić Zoran, Kononenko Igor ................ 18 Detection of Irregularities on Automotive Semiproducts / Dovgan Erik, Gantar Klemen, Koblar Valentin, Filipič Bogdan ................................................................................................................................................... 22 An Elderly-Care System Based on Sound Analysis / Frešer Martin, Košir Igor, Mirchevska Violeta, Luštrek Mitja ..................................................................................................................................................... 26 Are Humans Getting Smarter due to AI? / Gams Matjaž ..................................................................................... 
30 Developing a Sensor Firmware Application for Real-Life Usage / Gjoreski Hristijan, Luštrek Mitja, Gams Matjaž ............................................................................................................................................................... 34 Automatic Recognition of Emotions From Speech / Gjoreski Martin, Gjoreski Hristijan, Kulakov Andrea .......... 38 Qualcomm Tricorder Xprize Final Round: A Review / Gradišek Anton, Somrak Maja, Luštrek Mitja, Gams Matjaž ............................................................................................................................................................... 42 Avtomatizacija izgradnje baze odgovorov virtualnega asistenta za občine in društva / Jovan Leon Noe, Nikič Svetlana, Kužnar Damjan, Gams Matjaž ................................................................................................ 46 Inferring Models for Subsystems Based on Real World Traces / Kerkhoff Rutger, Tavčar Aleš, Kaluža Boštjan .............................................................................................................................................................. 50 Inclusion of Visual y Impaired in Graphical User Interface Design / Konecki Mario ............................................. 54 Mining Telemonitoring Data from Congestive-Heart-Failure Patients / Luštrek Mitja, Somrak Maja ................... 58 Approximating Dex Utility Functions With Methods UTA And ACUTA / Mihelčić Matej, Bohanec Marko ........... 62 Comparing Random Forest and Gaussian Process Modeling in the GP-Demo Algorithm / Mlakar Miha, Tušar Tea, Filipič Bogdan ................................................................................................................................ 66 Comprehensibility of Classification Trees – Survey Design / Piltaver Rok, Luštrek Mitja, Gams Matjaž, Martinčič Ipšić Sandra ...................................................................................................................................... 70 Pametno vodenje sistemov v stavbah s strojnim učenjem in večkriterijsko optimizacijo / Piltaver Rok, Tušar Tea, Tavčar Aleš, Ambrožič Nejc, Šef Tomaž, Gams Matjaž, Filipič Bogdan ....................................... 74 Determination of Classification Parameters of Barley Seeds Mixed with Wheat Seeds by Using ANN / Sabancı Kadir, Aydın Cevat ............................................................................................................................. 78 Novi Govorec: naravno zveneč korpusni sintetizator slovenskega govora / Šef Tomaž ..................................... 81 Cloud-Based Recommendation System for E-Commerce / Slapničar Gašper, Kaluža Boštjan .......................... 85 Novel Image Processing Method in Entomology / Tashkoski Martin, Madevska Bogdanova Ana ...................... 89 Arhitektura sistema OpUS / Tavčar Aleš, Šorn Jure, Tušar Tea, Šef Tomaž, Gams Matjaž ............................... 93 Predictive Process-Based Modeling of Aquatic Ecosystems / Vidmar Nina, Simidjievski Nikola, Džeroski Sašo.................................................................................................................................................................. 97 Recognition of Bumblebee Species by their Buzzing Sound / Yusupov Mukhiddin, Luštrek Mitja, Grad Janez, Gams Matjaž ....................................................................................................................................... 
102
The Las Vegas Method of Parallelization / Zavalnij Bogdan .............................................................. 105
Resource-Demand Management in Smart City / Zupančič Jernej, Kužnar Damjan, Kaluža Boštjan, Gams Matjaž ............................................................................................................................................................. 109
Indeks avtorjev / Author index .............................................................................................................................. 113
Zbornik 17. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2014 Zvezek A
Proceedings of the 17th International Multiconference INFORMATION SOCIETY – IS 2014 Volume A
Inteligentni sistemi
Intelligent Systems
Uredila / Edited by Rok Piltaver, Matjaž Gams
http://is.ijs.si
7. - 8. oktober 2014 / October 7th - 8th, 2014
Ljubljana, Slovenia
PREDGOVOR
Konferenca Inteligentni sistemi je od leta 1997 naprej sestavni del multikonference Informacijska družba. Poglavitne teme so inteligentni sistemi in inteligentne storitve informacijske družbe, oz. programski sistemi informacijske družbe, tehnične rešitve v inteligentnih sistemih, možnosti njihove praktične uporabe, pa tudi trendi, perspektive, nujni ukrepi, prednosti in slabosti, priložnosti in nevarnosti, ki jih v informacijsko družbo prinašajo inteligentni sistemi.
V 2014 se nesluten razvoj informacijske družbe in zlasti umetne inteligence nadaljuje s čedalje hitrejšim tempom. V nekaterih državah po svetu že vozijo avtonomni avtomobili, npr. »Google car«. Nekoč utopične ideje Raya Kurzweila o točki singularnosti in preskoku v novo človeško ero se tako zdijo čedalje bližje. Hkrati pa se razlike med ljudmi povečujejo in nihče prav dobro ne razume družbenih sprememb, ki smo jim priča.
Tudi letos konferenca Inteligentni sistemi sestoji iz mednarodnega dela in delavnice; prispevki so tako v slovenskem kot angleškem jeziku. Sprejetih je več kot 25 prispevkov, ki so bili recenzirani s strani vsaj dveh anonimnih recenzentov, avtorji pa so jih popravili po navodilih recenzentov. Večina prispevkov obravnava raziskovalne dosežke Odseka za inteligentne sisteme Instituta »Jožef Stefan«. Hkrati s predstavitvijo poteka tudi aktivna analiza prispevkov vsakega predavatelja in diskusija o bodočih raziskavah.
Rok Piltaver in Matjaž Gams, predsednika konference
PREFACE
The Intelligent Systems conference has been one of the fundamental parts of the Information Society multiconference since its beginnings in 1997. The conference addresses important aspects of the information society: intelligent computer-based systems and the corresponding intelligent services, technical aspects of intelligent systems, their practical applications, as well as trends, perspectives, advantages and disadvantages, opportunities and threats that are being brought by intelligent systems into the information society.
The progress of the information society and of intelligent systems has continued to accelerate in recent years. For example, some countries have already allowed autonomous car driving. The once utopian ideas of Ray Kurzweil that human civilisation will embrace a new, intelligent era are becoming widely accepted. At the same time, it seems that nobody fully understands the emerging changes in human society.
The conference consists of an international event and a workshop, and presents over 25 papers written in English or Slovenian.
The papers have been reviewed by at least two anonymous reviewers and the authors have modified their papers according to the remarks. Papers from the Jozef Stefan Institute - Department of Intelligent Systems are presented separately. Each presentation consists of the classical paper report, and further includes analysis of researcher’s achievements and future research plans of each presenter. Rok Piltaver and Matjaž Gams, Conference Chairs 3 4 MULTIOBJECTIVE OPTIMISATION OF WATER HEATER SCHEDULING Jure Brence1∗, Žiga Gosar1∗, Vid Seražin1∗, Jernej Zupančič2, Matjaž Gams2 1Faculty of mathematics and physics, University of Ljubljana, Jadranska ulica 19, 1000 Ljubljana 2Department of Intelligent Systems, Jozef Stefan Institute, Jamova cesta 39, 1000 Ljubljana e-mail: {jure.brence, ziga.gosar, vid.serazin}@student.fmf.uni-lj.si, {jernej.zupancic, matjaz.gams}@ijs.si ABSTRACT is presented, which addresses the extraction of household wa- ter usage patterns with the goal of peak-shaving and reducing In this paper we present our work on the optimisation the load on the power-grid. In [3] similar goals are addressed, of water heater scheduling. The goal is to develop intelli- while approaching the problem from a different angle, util- gent strategies for controlling the electric heater and heat ising fuzzy logic to control electric water heaters. Similarly, pump in commercial combined water heaters. Strate- in [4] the focus is on a solution that decreases peak load on gies try to find the best compromise between comfort and the grid by scheduling heating outside peak hours. In [5] a price, based only on information about the temperature of simulation platform to model electric water heaters and test water in the reservoir. A simulation and testing environ- demand response control strategies in a smart grid is intro- ment has been implemented to compare the performance duced. of existing and new strategies. 3 THE PROBLEM 1 INTRODUCTION The aim is to develop intelligent strategies for the scheduling Hot water heating is the biggest component of electricity con- of water heating. There are several types of water heaters on sumption in residential homes, contributing as much as 20% the market, the difference being their source of energy. The to the total electricity consumption in an average Slovenian most interesting are combined water heaters that have both an household [11]. Water heater manufacturers continually de- electric heater and a heat pump at its disposal. The control velop improvements to the mechanical aspects of water heat- unit of a combined water heater is able to control the different ing. However, the potential for savings by smarter power heaters separately. At any given moment the controller de- scheduling is quite unexplored. Most water heater controllers cides whether a heater is to be turned off or on. Water heaters tend to keep water temperature at pre-set levels throughout typically have a single thermometer installed, usually on the the day, with the exception of user-defined schedules. This top of the water reservoir. This measurement is the only in- results in increased heat loss and, more importantly, bigger formation an intelligent controller gets about the state of the loads on the power grid during peak hours. An intelligent water in the reservoir and the consumption habits of the users. controller would be able to find and optimised schedule of The development of intelligent strategies is a multi- water heating, customised for the habits and wishes of users. 
objective optimisation problem. The first objective is the elec- It is important not only to minimise the price of heating, but tricity cost and the second is some measure of discomfort of to do so with a minimal increase in user discomfort level. the users. Any strategy will have to be a trade-off between the two. Our solution will be a set of strategies, among which the 2 RELATED WORK user will be able to choose the one with the desired trade-off between price and comfort. Some research on the topic of electric water heaters has al- ready been done. All stated sources are dealing with devices using only an electric heater, whereas our research focuses on 4 STRATEGIES combined devices. Much of existing work perceives user dis- We have implemented a number of different strategies. Each comfort as a constraint, rarely incorporating it as one of the falls into one of two categories that differ by the information objectives. that is available to the controller. In [1] solutions are provided for an electric water heater The simplest strategies are static strategies that use only that is connected to an electrical grid where the electricity tar- predefined settings and current measurements. These could iff is dynamically changed in real time, and mainly focuses be date, time, temperature and the temperature in the previ- on optimisation in regard to this tariff system. In [2] a model ous minute. Static strategies follow a predefined set of rules. ∗These authors equally contributed to the paper. While they do not learn or modify their behaviour, different 5 rule-sets may be defined for different periods of the day, or achieved with the heat pump alone, Bulk begins utilis- days in the week. Some static strategies: ing the electric water heater. This way, any heating is done directly before water consumption. It can also be 1. On-Off Control (lower T, upper T, electric heater, heat modified to heat during the lower price tariff to accumu- pump) is the strategy used in most commercial water late heat. This way Bulk produces a result with minimal heaters. Sometimes called Bang-Bang Control. The discomfort at an almost minimal price. boolean constants electric heater and heat pump spec- ify if the strategy is allowed to use electric heater and There is also a third category of strategies that learn from heat pump. When the temperature drops below lowerT the past and adjust their decision-making to best fit the user all available heat sources are turned on until upperT tem- habits. This kind of strategies are the final goal of our re- perature is reached. search. 2. Intervals (list of intervals with appropriate strategies) 5 METHODS uses different strategies in different parts of the day (e.g. when electricity is cheaper or when the user ex- The basic method applied in this research is the testing and pects higher water consumption). At initialisation we comparison of various scheduling strategies. We utilise com- can specify any number of intervals and corresponding puter simulations, as running these tests on real water heaters strategies. One example is a sub-strategy called Heat would require a lot of time and resources, which we do not Less at Noon which uses On-Off Control(40, 41, False, have at our disposal. To this purpose, we have developed a True) between 9 am and 3 pm and On-Off Control(45, water heater simulator and a water consumption simulator. 50, False, True) at other times. 
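The On-Off Control and Intervals strategies described above are simple enough to sketch in code. The following Python fragment is only an illustration under stated assumptions: the class name, the step() interface (current tank temperature in, heater commands out) and the minute-based control loop are assumptions, not the authors' implementation.

# Illustrative sketch of the static On-Off (bang-bang) control strategy
# described above. The interface (a step() method receiving the current
# tank temperature and returning which heaters to run) is a hypothetical
# assumption, not the actual controller code of the project.

class OnOffControl:
    def __init__(self, lower_t, upper_t, use_electric_heater, use_heat_pump):
        self.lower_t = lower_t                  # turn heating on below this temperature
        self.upper_t = upper_t                  # turn heating off above this temperature
        self.use_electric_heater = use_electric_heater
        self.use_heat_pump = use_heat_pump
        self.heating = False                    # current heating state (hysteresis)

    def step(self, temperature):
        """Return (electric_heater_on, heat_pump_on) for the current minute."""
        if temperature < self.lower_t:
            self.heating = True                 # all allowed heat sources on
        elif temperature >= self.upper_t:
            self.heating = False                # target reached, everything off
        return (self.heating and self.use_electric_heater,
                self.heating and self.use_heat_pump)


# Example: the "Heat Less at Noon" sub-strategy mentioned above uses
# On-Off Control(40, 41, False, True) between 9 am and 3 pm and
# On-Off Control(45, 50, False, True) at other times.
noon_controller = OnOffControl(40, 41, use_electric_heater=False, use_heat_pump=True)
default_controller = OnOffControl(45, 50, use_electric_heater=False, use_heat_pump=True)

def intervals_strategy(minute_of_day, temperature):
    controller = noon_controller if 9 * 60 <= minute_of_day < 15 * 60 else default_controller
    return controller.step(temperature)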
This strategy is similar up to constants values to some real strategies consumers 5.1 Water heater simulation use. Real specifications [9, 10] of commercial water heaters were 3. New On-Off (lower boundary for electric heater, lower used, namely: dimensions of the reservoir, power of heaters, boundary for heat pump). When the temperature drops coefficient of performance (COP) of the heat pump, thermal below the predefined lower boundary for electric heater conductivity of the insulation and maximum flow rate. Typ- the electric heater turns on until the temperature is higher ical water heaters are shaped cylindrically, with cold water than this boundary. The heat pump works on the same entering the reservoir at the bottom and hot water leaving on principle but with a different boundary temperature, the top. The position of the heating element varies with the which is usually higher. model. Some manufacturers choose to position the heater at the bottom, to encourage the convection of hot water, oth- 4. Rules Z. A day is divided in N regions that are set by ers attempt to heat uniformly along the vertical axis, or some the user. In each region a set of boundary temperatures, other option. In current tests water is heated uniformly. Com- as well as boundary temperature changes is defined. The bined water heaters have two types of heaters: electric heater two different heaters are turned on or off based on the and heat pump. With the electric heater, the thermal power boundary conditions for the current region. it produces is equal or close to equal to the electric power it consumes. As such, its heating power is fixed. The heat Oracle strategies are given the future water consumption pump, on the other hand, produces more thermal energy than schedule that they use to calculate the plan of how and when the amount of electric energy it uses. The ratio between the they will heat the water. We use these strategies to get the two – COP – typically falls into a range from 2 to 5.5. The best trade-off between discomfort and price. There is no other COP of a heat pump depends on the temperature of the heat strategy with a strictly better performance in both objectives. source, often the outside air, and the temperature of water. 1. Brute Force makes decisions at discrete time intervals During the heating process, as water temperature increases, of predefined length, usually 1 or 10 minutes. At every the COP drops. step four options are available: no heating, only electric The simulation does not attempt to simulate the complex heater, only heat pump, or both. Brute force simulates thermodynamics and fluid mechanics happening in the wa- every possibility, looking for the optimal one. Theoret- ter heater. It rather uses a simplified model that manages to ically every possible strategy would be tested by Brute emulate the responses of the built-in thermometer to various Force, allowing us to find the true Pareto front. This inputs. The water in the reservoir is divided into 20 layers approach is not practical due to its computational ineffi- along the vertical axis. All the water in one layer has the ciency. same temperature. The water heater is simulated with a one minute step. Each step, energy losses are calculated for each 2. Bulk starts with a decision to never heat. It simulates layer, taking into account the water temperature, outside tem- the water heater until it reaches discomfort. Then it perature and the thermal conductivity of the container walls. 
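As a rough illustration of the simplified reservoir model described above, the sketch below advances a stack of water layers by one simulated minute. The reservoir and heater ratings (230 L tank, 1500 W electric heater, 2000 W heat-pump heating power at a COP of 3.3) are taken from the test setup reported later in the paper; the per-layer loss coefficient, the function names and the omission of inter-layer heat exchange, layer merging and water draws are simplifying assumptions.

# Illustrative sketch of a layered water-heater model in the spirit of the
# simulator described above: the reservoir is split into horizontal layers,
# each with a single temperature; every simulated minute each layer loses heat
# through the container walls and, when heating is on, receives an equal share
# of the delivered thermal energy. Names and constants are assumptions.

N_LAYERS = 20
LAYER_VOLUME_L = 230.0 / N_LAYERS       # 230 L reservoir split into 20 layers
WATER_HEAT_CAPACITY = 4186.0            # J/(kg*K); 1 L of water is roughly 1 kg
LOSS_COEFF_W_PER_K = 0.1                # assumed wall-loss coefficient per layer

def simulate_minute(layers, outside_t, electric_on, heat_pump_on,
                    electric_power_w=1500.0, heat_pump_thermal_w=2000.0, cop=3.3):
    """Advance the layer temperatures by one minute; return electricity used in kWh."""
    dt = 60.0  # seconds in one simulation step
    # 1) heat losses through the container walls, computed per layer
    for i, t in enumerate(layers):
        loss_j = LOSS_COEFF_W_PER_K * (t - outside_t) * dt
        layers[i] = t - loss_j / (LAYER_VOLUME_L * WATER_HEAT_CAPACITY)
    # 2) uniform heating: the heat pump delivers heat_pump_thermal_w of thermal
    #    power while drawing only heat_pump_thermal_w / cop of electric power
    thermal_w = (electric_power_w if electric_on else 0.0) + \
                (heat_pump_thermal_w if heat_pump_on else 0.0)
    heat_per_layer_j = thermal_w * dt / N_LAYERS
    for i in range(N_LAYERS):
        layers[i] += heat_per_layer_j / (LAYER_VOLUME_L * WATER_HEAT_CAPACITY)
    # 3) electric energy consumed during this minute, in kWh
    electric_w = (electric_power_w if electric_on else 0.0) + \
                 (heat_pump_thermal_w / cop if heat_pump_on else 0.0)
    return electric_w * dt / 3.6e6

# The controller only ever sees the built-in thermometer at the top of the
# reservoir, i.e. layers[-1], not the full temperature profile.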
starts rewinding time back and turning the heat pump Heat exchange between neighbouring layers is simulated with on until the discomfort reaches zero. If this cannot be an experimentally set heat transfer coefficient to match real 6 data. When heating is turned on, each layer receives its share corresponding temperature. The total discomfort of a mea- of thermal energy. As the simulator receives a request for surement is defined as the sum of individual discomforts. hot water, it removes the appropriate volume of water from The calculations of price have to take into account different the top layers and adds cold water layers at the bottom. The price tariffs. The majority of Slovenian electricity providers number of layers and their individual size is kept in check by use a two-tariff system, with the lower tariff from 22.00 to joining neighbouring layers with similar temperatures. Man- 6.00 and during weekends and the higher tariff from 6.00 to ufacturers usually take special care to minimise the mixing of 22.00 during week days. The prices vary between suppliers. water in the reservoir. This provides the consumer with a bet- We use a lower tariff price of 0.04320 e/kWh and a higher ter experience that is, the outgoing water stays at the almost tariff price of 0.07795 e/kWh [11]. the same temperature during a shower, unless all hot water is used up. Our simulator is able to reproduce this behaviour to a sufficient degree. 6 TESTING AND RESULTS At the beginning of the test, the generator of water consump- 5.2 Water consumption simulation tion produces a semi-random plan of consumption. The water heater is simulated with a one minute step for the specified A number of sources [6, 7, 8] for water consumption measure- duration of the experiment, usually several weeks. Water is ment were used to develop a simulation of water consumption used according to the schedule, while the heating of the water during weekdays and weekends. When a household with a heater is controlled by the tested strategy. The process is re- specified number of members is generated, each individual peated for other strategies using the same consumption sched- is assigned a semi-random consumption pattern. A specific ule. The whole experiment is ran multiple times with different consumption schedule is then generated based on the patterns consumption schedules. The result of the experiment are the of individuals, with added variance using Gaussian distribu- average price and discomfort for each of the tested strategies tions. We separate two types of events. 3 to 10 small events (Figure 2). (e.g. washing hands) per user are randomly scattered through- As anticipated, Bulk achieves the best comfort, which is out the day, taking one minute and using less than 1 litre of usually near zero. Other strategies with comparable comfort hot water. Large events (showers) happen 1 to 3 times per day achieve it at a much higher price. A generally best perfor- per user, and take between 5 and 30 minutes, with 1-6 litres mance is achieved by On-Off Control, using only the heat of water per minute at a mean 38◦C. pump heater. Most of our static strategies are dominated by Both real and simulated water consumptions vary greatly On-Off Control and Bulk. on a day to day basis. Figure 1 shows the distribution of sim- Each strategy has a number of parameters that can be var- ulated hot water consumption over a longer time period. ied to achieve different results. 
By varying the boundaries of On-Off Control we produce three fronts, one for each type of heating (figure 2). Varying the parameters of Oracle strategies would produce another front. Ideal solutions would dominate On-Off Control – type strategies, while being dominated by Oracle strategies. 7 Conclusion The project is aiming to develop intelligent strategies for the scheduling of water heating in commercial water heaters. So Figure 1: Simulated hot water consumption, averaged over far we have developed a complete testing environment for 50 week days for 100 different households. comparing different strategies. We have implemented and tested most commercially used strategies. For comparison we have also implemented one Oracle strategy that achieves the best comfort possible. 5.3 Discomfort and price In order to find a good approximation of the Pareto front Different strategies under different water consumption pro- for our consumption simulator, we intend to develop a set of files were evaluated using discomfort and price as criteria. Oracle strategies that are capable of achieving a specific trade- Discomfort for a minute of our simulation is defined as: off of price and comfort. We also plan to utilise evolutionary algorithms to optimise 0 if To ≥ Tr the various static strategies. Finally, we want to develop in- discomf ort = (1) (Tr−To)∗V if T telligent strategies that adapt by learning from the past. 1000 o < Tr An immediate application of our system is to provide the where Tr is the requested temperature by the user, To is the user with a more intuitive way of choosing the most appropri- outflow temperature and V is the volume of water with the ate strategy. As it stands, users manually choose the On-Off 7 Figure 2: Averaged price and discomfort for a number of static strategies and Bulk. The colours dark cyan, blue and green represent static strategies using only the electric heater, only the heat pump and both, respectively. Strategies coloured orange are variations of Intervals. Other static strategies are coloured purple. Oracle strategies are black. Household with 4 members, using a 230 L water heater with a 1500 W electric heater and a heat pump with a heating power of 2000 W and a COP of 3.3 at 35oC. Control settings, which is generally the preferred water tem- Strategies for Demand Response. In Power and Energy perature. With our simulation, users would only need to de- Society General Meeting, IEEE (2012): 1-8. cide on a price to comfort trade-off, and the controller would choose the best strategy and settings to improve their comfort [6] Energy Monitoring Company, Energy Saving Trust. while lowering their costs. Measurement of Domestic Hot Water Consumption in Dwellings (2008). References [7] B. Schoenbauer, D. Bohac, M. Hewett. Measured Resi- dential Hot Water End Use. ASHRAE Transactions 118 [1] P. Du, N. Lu., Appliance Commitment for Household (2012): 872-889. Load Scheduling. Smart Grid, IEEE Transactions on Smart Grid 2, no. 2 (2011): 411-419. [8] Equipment Energy Efficiency. Water Heating Data Col- lection and Analysis, 2012. [2] L. Paull, H. Li, L. Chang. A novel domestic electric water heater model for a multi-objective demand side manage- [9] Coolwex. DSW 300 - Navodila za inštalacijo in uporabo. ment program. Electric Power Systems Research 80, no. Klima center Horizont, Maribor, 2010. 12 (2010): 1446-1451. [10] Kronoterm. Katalog produktov, 2014. URL http:// [3] B. J. LaMeres, M. H. Nehrir, V. 
Gerez Controlling the www.kronoterm.com/wp-content/uploads/ average residential electric water heater power demand 2014/katalog-40-mar-2014-web.pdf. Ac- using fuzzy logic. Electric Power Systems Research 52, quired 20.8.2014. no. 3 (1999): 267-271. [11] Statistični urad Republike Slovenije. Podatkovni por- [4] A. Moreau. Control Strategy for Domestic Water Heaters tal SI-STAT, Okolje in naravni viri, Seznam tabel. during Peak Periods and its Impact on the Demand for URL http://pxweb.stat.si/pxweb/Dialog/ Electricity. Energy Procedia 12 (2011): 1074-1082. statfile2.asp [5] R. Diao, S. Lu, M. Elizondo, E. Mayhorn, Y. Zhang, N. Samaan. Electric Water Heater Modeling and Control 8 ANALIZA NAKUPOV IN MODELIRANJE POSPEŠEVANJA PRODAJE V SPLETNI TRGOVINI Matija Č erne (Fakulteta za matematiko in fiziko, Jadranska 19, 1000 Ljubljana, Slovenija), Boštjan Kaluža, Mitja Luštrek Odsek za inteligentne sisteme, Inštitut Jožef Stefan Jamova cesta 39, 1000 Ljubljana, Slovenija Tel: +386 1 4773419; fax: +386 1 4251038 e-mail: matija.cerne@student.fmf.uni-lj.si bostjan.kaluza@ijs.si mitja.lustrek@ijs.si POVZETEK Večina analize je temeljila na dveh datotekah, katerih izseka Analizirali smo podatke o nakupih v spletni trgovini. vidimo spodaj: Cilja sta bila ugotoviti u činek spremembe cene na potrošnjo in identifikacija potrošnikovih preferenc v šifra nakupa šifra izdelka količina cena opis izdelka nekem trenutku. Pri analizi smo uporabljali tako 1 349908 150502 1 1.49 Dzem Eta. 450g mikroekonomske kot tudi statistične pristope. V grobem 2 386589 150502 1 1.49 Dzem Eta. 450g lahko metode analize razdelimo na dva sklopa – tiste, ki 3 384333 150502 1 1.49 Dzem Eta. 450g se osredotočajo na uporabnika in tiste, pri katerih je 4 350190 150502 1 1.49 Dzem Eta. 450g pomemben le artikel. 5 350564 150502 1 1.49 Dzem Eta. 450g 6 350550 150502 1 1.49 Dzem Eta. 450g 1 UVOD 7 344657 150507 1 0.34 Sol Morska 1kg Zanimajo so nas predvsem rezultati, ki bi jih lahko uporabili 8 341269 150507 1 0.34 Sol Morska 1kg za priporočanje artiklov tako znanim (obstoječim), kot tudi 9 341373 150507 1 0.34 Sol Morska 1kg neznanim (novim) uporabnikom. Pri priporočanju gre za to, da čimbolj natančno ugotovimo, kateri izdelek bi nekega 10 345727 150507 1 0.34 Sol Morska 1kg uporabnika poleg kupljenih še utegnil zanimati, nato pa mu Podatki o nakupih (izsek). Št.vrstic: 15176 ta izdelek spletna prodajalna priporoči. Iz prodajalčevega vidika je to precej pomembno orodje pospeševanja prodaje, znesek šifra nakupa šifra uporabnika datum nakupa naročila še posebej v kontekstu spletne trgovine. V nasprotju s 1 334366 127348 3.47 2012-07-24 klasično trgovino lahko tu v vsakem trenutku vidimo uporabnikovo košarico, pa tudi uporabnika samega lahko 2 335402 37507 8.05 2012-07-27 identificiramo, kar v praksi (za ne-uporabnike raznih kartic 3 336527 248562 30.94 2012-08-02 zvestobe) ni izvedljivo. Priporočila se na spletni strani 4 336934 248562 1.49 2012-08-06 izvedejo v obliki seznama priporočenih izdelkov, kar je 5 337402 248562 1.34 2012-08-07 izvedljivo v realnem času, če smo podatke predhodno 6 337404 37507 9.16 2012-08-08 pravilno obdelali. 7 337634 249741 8.29 2012-08-08 8 337643 249741 100.58 2012-08-08 2 PODATKI 9 337648 248562 2.29 2012-08-08 Na voljo smo imeli podatke o vseh nakupih, ki so se v 10 337663 248562 17.24 2012-08-08 spletni trgovini zgodili med 24. Julijem 2012 in 15. Podatki o prodanih izdelkih (izsek). Št. vrstic: 347332 Januarjem 2014. Za posamezne izdelke tako vemo kdaj, koliko in po kakšni ceni so bili prodani. 
Ob kasnejši Poleg tega smo imeli tudi podatke o uporabnikih, ki so obdelavi smo sicer ugotovili, da obstaja možnost, da povedali ali je posamezen uporabnik fizična oseba ali določeni podatki manjkajo (predvsem na začetku obdobja), podjetje, ter poštno številko njegovega prebivališča. Ker je vendar je to upoštevano v analizi oziroma pri rezultatih. 9 cilj projekta priporočanje produktov fizičnim uporabnikom, 2.3 Obdelava podatkov smo se odločili da bomo obravnavali samo fizične osebe. Če Za nadaljno uporabo je bilo potrebno združiti podatke o bi obravnavali podatke obeh kategorij skupaj, bi zaradi naročenih izdelkih in uporabnikih – torej pogledati, kateri velike razlike v obsegu potrošnje, pa tudi zaradi specifičnih ‘order ID-ji’ pripadajo kateremu uporabniku in nato za potreb podjetnikov ki se navadno razlikujejo od potreb posamezno naročilo (order) združiti izdelke ki so bili 'običajnih' potrošnikov, verjetno dobili precej nezanesljive kupljeni. rezultate. Zato smo najprej iz podatkov o nakupih izločili Za potrebe cenovne analize je bilo potrebno podatke tiste, ki so jih opravila podjetja. Glede lokacije uporabnikov transformirati v takšno obliko, da lahko razberemo se nam v dosedanji analizi to ni zdel dovolj pomemben informacijo o potrošnji ob določeni ceni. Natančneje, dejavnik pri potrošnikovih odločitvah in temu nismo potrebovali smo neko mero za ‘moč potrošnje’ v določenem posvečali posebne pozornosti. cenovnem obdobju (torej obdobju med dvema Na voljo smo imeli še podatke o lastnostih izdelkov, vendar spremembama cene) in najbolj logična mera se je zdela teh podatkov nismo obravnavali. frekvenca nakupov (enota: št.izdelkov/dan): Ena od težav je bila ta, da nismo imeli točnih podatkov o datumih sprememb cen in smo tako datum spremembe morali aproksimirati z datumom, ko se je prvič zgodil nakup po novi ceni. To pa za izdelke, ki se ne kupujejo vsak dan (in takšnih je večina) pomeni, da se obdobja ko neka cena Tu je sicer nastopil problem določitve obdobij ko velja neka velja, lahko precej razlikujejo od resničnih obdobij. Ravno cena, saj kot smo že prej omenili, nimamo točnih datumov zaradi tega, pa tudi zaradi premajhne količine podatkov (kar sprememb. Problematični so bili predvsem primeri, ko se je bi imelo za posledico premalo zanesljive rezultate) smo se nek nakup zgodil po ceni pred spremembo, vendar je bil odločili, da v analizah, kjer je to pomembno, obravnavamo zabeležen datum komaj v obdobju, ko je veljala naslednja samo določeno število izdelkov, za katere imamo dovolj cena – tako se je večkrat zgodilo tudi, da smo imeli na isti podatkov. dan iste artikle prodane po različnih cenah. Možna razlaga za to je, da se je cena izdelka zabeležila ob izdaji računa, 2.2 Vizualizacija podatkov datum nakupa pa je obveljal kot datum plačila – ni namreč Graf prikazuje, kako so porazdeljeni uporabniki glede na nujno, da je bil račun takoj plačan. Kakorkoli, v takšnih število nakupov, ki jih opravijo (horizontalna os) in primerih je bilo potrebno ‘izravnati šum’ in naročila s staro povprečno vrednost nakupa (vertikalna os). Vsaka pika ceno postaviti v prejšnje obdobje, sicer bi imeli ob predstavlja enega uporabnika: nekaterih spremembah cene lahko hude distorzije v frekvenci nakupov. To je bilo (za nekatere obravnavane izdelke) narejeno kar ročno, saj bi bilo sicer pretežko dovolj dobro definirati, katerim naročilom je potrebno spremeniti datum. 
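Kot ilustracijo zgoraj opisane obdelave podatkov navajamo spodnjo skico v Pythonu (knjižnica pandas), ki združi podatke o prodanih izdelkih z naročili, obdrži le fizične osebe ter za izbrani izdelek izračuna frekvenco nakupov po cenovnih obdobjih. Imena stolpcev sledijo zgornjima izsekoma, stolpec o tipu uporabnika ter imena funkcij pa so zgolj naše predpostavke in ne dejanska implementacija projekta.

# Sketch of the preprocessing described above: join purchase lines with their
# orders, keep only physical persons and compute the purchase frequency
# (items per day) for every price period of one product. Column names follow
# the excerpts shown earlier; everything else is an assumption.
import pandas as pd

def frekvence_nakupov(nakupi: pd.DataFrame, narocila: pd.DataFrame,
                      uporabniki: pd.DataFrame, sifra_izdelka: int) -> pd.DataFrame:
    # keep only orders placed by physical persons (column name is hypothetical)
    fizicne = uporabniki.loc[uporabniki["fizicna oseba"], "šifra uporabnika"]
    narocila = narocila[narocila["šifra uporabnika"].isin(fizicne)]

    # attach the order date to every purchased item
    vrstice = nakupi.merge(narocila[["šifra nakupa", "datum nakupa"]], on="šifra nakupa")
    vrstice = vrstice[vrstice["šifra izdelka"] == sifra_izdelka].sort_values("datum nakupa")

    # a new price period starts whenever the recorded price changes; the change
    # date is approximated by the first purchase at the new price
    vrstice["obdobje"] = (vrstice["cena"] != vrstice["cena"].shift()).cumsum()

    def povzetek(skupina):
        dni = (skupina["datum nakupa"].max() - skupina["datum nakupa"].min()).days + 1
        return pd.Series({"cena": skupina["cena"].iloc[0],
                          "frekvenca": skupina["količina"].sum() / dni})

    return vrstice.groupby("obdobje").apply(povzetek)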
SLIKA 1: porazdelitev uporabnikov spletne trgovine Opazimo, da večina uporabnikov za svoj nakup zapravi okoli 70 eur, in v obravnavanem obdobju manj kot petnajstkrat nakupuje v spletni trgovini. SLIKA 2: Graf frekvenc nakupov za izdelek ‘Mineralna voda Radenska classic, kombiniran z grafom cen 10 3. ANALIZA - METODE IN REZULTATI vendar vseeno vsaj okvirno vidimo, kateri izdelki so bolj, Najprej smo analizirali potrošnjo v odvisnosti od cen kateri pa manj občutljivi na spremembe v ceni (v tabeli (cenovna analiza), nato pa še analizirali nakupovalne izgleda da je Alpsko 3,5 najbolj, 1,6 pa najmanj cenovno navade desetih najbolj zanimivih uporabnikov. stabilno). Vseeno ta metoda ni najbolj zanesljiva za napovedovanje potrošnje, saj lahko predvidevamo, da so 3.1 Cenovna analiza spremembe odvisne tudi od številnih drugih dejavnikov (npr Pri cenovni analizi raziskujemo, kako se potrošnja oglaševanje, substituti iz drugih trgovin, substituti ki jih (frekvenca nakupov) spreminja v odvisnosti od spremembe nismo upoštevali pri analizi, šum na podatkih, …). V tabeli v ceni. Z uporabo mikroekonomskega pojma cenovne je to najbolj vidno pri Alpskem mleku 3,5, kjer izgleda da elasti je 1% spremembe v ceni prinesel 2,42% spremembe čnosti smo poskušali oceniti vpliv spremembe cene na potrošnjo istega oziroma sorodnih izdelkov. Nato nas (pozitivne!) v potrošnji. Ko pogledamo na graf frekvenc, pa zanima tudi, kaj se dogaja ob specifi opazimo, da je potrošnja dobrega pol leta od začetka čni kratkotrajni spremembi cene – akciji. merjenja skoraj nič, torej se lahko upravičeno vprašamo, ali je to res (kar je malo verjetno, glede na to da gre za enega 3.1.2 Cenovna elasti najbolj prodajanih artiklov za katerega dobro vemo, da ni č nost Za u prišel v prodajo komaj pred enim letom) in če se je mogoče činek sprememb cene na potrošnjo istega izdelka cenovno elasti zgodila napaka pri knjiženju naročil – recimo če se je vmes čnost izračunamo tako: zamenjala koda izdelka in to ni bilo popravljeno v bazi podatkov. Za učinek spremembe cene nekega izdelka na potrošnjo 3.1.2 Analiza uč inkov akcij nekega drugega izdelka (substituta) potrebujemo koeficient Ob opazovanju grafov prometa (prihodki od prodaje na dan; križne elastičnosti. Ta nam pove, za koliko odstotkov se cena krat frekvenca) in cen v času za nekatere izdelke spremeni potrošnja dobrine B ob spremembi cene dobrine opazimo, da za kratkotrajne padce v ceni (akcije) frekvenca A za en odsototek: potrošnje za to obdobje naraste, kar seveda ni presenetljivo. Zanimivo pa je dejstvo, da se velikokrat potrošnja po koncu akcije (vrnitvi cene na isto ali višjo raven kot prej) ne vrne V prvem primeru pričakujemo, da bo vrednost negativna na raven pred akcijo, temveč ostane višja kot je bila tedaj. (če se cena poveča, se troši manj nekega izdelka), Ob tem velja poudariti, da na potrošnjo poleg samega posledično pa pri sorodnih izdelkih (komplementih) znižanja cene gotovo vpliva tudi to, da se ob akciji tudi pričakujemo, da se bo potrošnja ob nespremenjeni ceni poveča promocija za izdelek (npr. objava v katalogu, povečala. Za potrebe računanja križnih elastičnosti je bilo reklama po televiziji). Poglejmo ta efekt na grafu za ‘Mleko potrebno še enkrat naračunati frekvence nakupov (q), tokrat Lejko 1,5%’ : po datumih sprememb cene vseh ostalih izdelkov, ki jih opazujemo skupaj. Spremembe se seveda ne zgodijo samo enkrat, zato ob vsaki spremembi cene lahko izračunamo novo cenovno elastičnost (tako enostavno kot križno). 
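Enačbi za cenovno in križno elastičnost, na kateri se besedilo sklicuje, sta standardni: lastna cenovna elastičnost je E = (%Δq)/(%Δp), križna elastičnost pa E_AB = (%Δq_B)/(%Δp_A). Spodnja skica v Pythonu (imena funkcij so naša predpostavka) ponazarja izračun elastičnosti ob posamezni spremembi cene in povprečenje čez vse spremembe.

# Hedged sketch of the elasticity calculation referred to above. The formulas
# are the standard microeconomic ones:
#   own-price elasticity:    E    = (%change in q)   / (%change in p)
#   cross-price elasticity:  E_AB = (%change in q_B) / (%change in p_A)
# evaluated around each price change and then averaged. Names are assumptions.

def elasticity(p_before, p_after, q_before, q_after):
    """Percentage change in purchase frequency divided by percentage change in price."""
    dp = (p_after - p_before) / p_before
    dq = (q_after - q_before) / q_before
    return dq / dp

def average_elasticity(price_changes, frequency_changes):
    """Average elasticity over all price changes of product A.

    price_changes:     list of (p_before, p_after) for product A,
    frequency_changes: list of (q_before, q_after) purchase frequencies of the
                       observed product (A itself, or a substitute B for the
                       cross elasticity) in the periods around the same dates.
    """
    values = [elasticity(pb, pa, qb, qa)
              for (pb, pa), (qb, qa) in zip(price_changes, frequency_changes)
              if pb != pa and qb > 0]
    return sum(values) / len(values) if values else float("nan")

# Illustrative numbers only: a 10 % price increase of product A followed by a
# 5 % rise in the purchase frequency of substitute B gives E_AB = +0.5.
print(average_elasticity([(1.00, 1.10)], [(2.00, 2.10)]))  # ~0.5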
Rezultate nato lahko aproksimiramo regresijsko ali pa izračunamo povprečje. Zaradi velike distorziranosti podatkov se je druga metoda izkazala za bolj primerno. V naslednji tabeli so predstavljeni rezultati (povprečne elastičnosti) za skupino substitutov ‘Mleka’, kjer so bili rezultati še najbolj skladni s pričakovanji: Po diagonali so vrednosti izračunane po prvi formuli, na SLIKA 3: Padec cene (akcija) je označ en z rdeč o elipso, promet pred in po akciji pa z modrima ostalih mestih pa po principu križne elasti č rtama čnosti. Razlaga vrednosti v tabeli (gledamo zadnjo vrstico). Če se cena mleka znamke 1,6 poveča za 1%, tedaj se potrošnja (število Pri obravnavanih izdelkih (100 artiklov z najvišjo prodajo) prodanih artiklov na dan) mlek 3,5, Alpsko 1,6 in Alpsko se je ta situacija večkrat ponovila. Za definirano akcijo (v 3,5 po vrsti poveča za 1,36 %, 1,1 %, in 1,06 %. Obenem se našem primeru je akcija definirana kot vsaka sprememba potrošnja mleka zmanjša za 1,02 %. cene za vsaj -4% in v trajanju največ tri tedne) smo izsledke Še vedno seveda ne moremo z gotovostjo trditi, da se bo predstavili v tabeli, kjer smo izračunali, kakšna je potrošnja spreminjala točno tako kot so vrednosti v tabeli, procentualna sprememba v prometu med (2. stolpec) in po 11 akciji (3. stolpec). Za izdelke, ki so imeli več akcij, smo izračunali povprečno spremembo, lahko pa bi uporabili tudi 3.2.2 Ciklič na potrošnja izdelkov kakšno drugo metodo aproksimacije – recimo z metodo Za izdelke, ki jih obravnavani uporabnik dovolj pogosto najmanjših kvadratov v tridimenzionalnem prostoru. kupuje, poskušamo ugotoviti, ali jih kupuje v časovnih Rezultati so predstavljeni v TABELI 1: Uč inki akcij (ki se intervalih in le-te identificirati. Tudi ta problem je občutljiv nahaja v Dodatku), vse vrednosti pa so v odstotkih. na število nakupov. Uporabna vrednost te informacije je v Tabela je urejena po zadjem stolpcu, torej nam pove, katere tem, da lahko v danem trenutku predvidimo, ali se bo zgodil izdelke (izmed 100 obravnavanih) se najbolj splača nakup nekega izdelka s strani obravnavanega uporabnika, postaviti v akcijo, če želimo pozitiven efekt na promet tudi ali ne. po akciji. Podatki (predvsem prvih nekaj) izgledajo precej Uporabimo statistični pristop – iščemo interval zaupanja, v nerealni in tako ekstremne vrednosti lahko pripišemo katerem bi se z neko verjetnostjo zgodil naslednji nakup. Ta motnjam pri podatkih. Vseeno lahko opazimo, da se v nam za določeno stopnjo (med 0 in 1) in ocene parametrov splošnem akcija prodajalcu iz vidika povečevanja prometa pove meje intervala, v katerem se nahaja neka slučajna splača – prvo se mu poveča promet zaradi povečane spremenljivka (naslednji nakup), ki je porazdeljena isto kot potrošnje, potem pa zaradi kombinacije povečane potrošnje so porazdeljeni podatki. Parametri so: perioda (povprečen in (ponovnega) dviga cene. Vendar pa lahko predvidevamo, čas, ki mine med dvema nakupoma določenega izdelka), da ob akciji zaradi učinka substitucije povzročimo padec standardni odklon (pove, kako močno varirajo časi med prometa za druge, podobne izdelke. nakupi), in datum zadnjega nakupa. Če privzamemo, da se trenutno nahajamo v času 2014-01- 3.2 Analiza nakupov uporabnikov 16 (prvi dan, za katerega nimamo več podatkov) Izbrali smo deset uporabnikov z največ nakupi, saj nam to predvidevamo, da bo uporabnik, v kolikor na ta dan opravi zagotavlja dovolj veliko količino podatkov za analizo nakup, kupil izdelke, ki so v TABELI 2: ciklič nost vsakega posebej. 
Iste metode kot so predstavljene v tem potrošnje ki se nahaja v Dodatku, obarvani rumeno (za te razdelku seveda lahko uporabimo tudi pri uporabnikih z izdelke je ‘trenutni’ datum 2014-01-16 znotraj intervala). manj nakupi, vendar se s tem (za nekatere metode) znatno Gledamo. zmanjša točnost napovedi. Za te metode bi bilo zato v V našem primeru je stopnjo zaupanja 0,9. Za izdelek s šifro primeru praktične uporabe smiselno določiti neko spodnjo 157869 ("Solata endivija", 2. vrstica v tabeli) bo tako glede mejo za število nakupov, ki jih je uporabnik že opravil. na naše podatke veljala napoved, da se bo naslednji nakup z verjetnostjo 90 % zgodil v obdobju med 16. in 30. 1. 2014. 3.2.1 Identifikacija zaželenih in nezaželenih izdelkov Pri tem je potrebno poudariti, da bi se v praksi ocene Radi bi opredelili odnos do izdelkov, ki jih obravnavani parametrov računale sproti, torej bi se z akumulacijo uporabnik kupuje. Natančneje, zanima nas, ali obstajajo podatkov natančnost napovedi povečevala. izdelki, za katere lahko sklepamo, da jih je uporabnik kupil le enkrat in nato nikoli več? Takšnih izdelkov potem temu 4. ZAKLJUČEK in njemu podobnim uporabnikom ne priporo čamo, saj Najprej smo opravili cenovno analizo, ki temelji na predvidevamo da uporabnik z izdelkom ni bil zadovoljen. podatkih o prodanih izdelkih. Cilj analize je bil predvsem V Dodatku je izsek grafa (GRAF 1 : nakupi uporabnika), ki raziskati, kako se potrošnja odziva na spremembe v ceni. Tu prikazuje nakupe uporabnika. Graf je precej velik smo ločili splošno obravnavo in obravnavo posebnih (natančneje, višina je število različnih artiklov, ki jih sprememb v ceni – akcij. Pridobljeni rezultati so bili v uporabnik kupi, v konkretnem primeru okoli 1000, dolžina nekaterih primerih pričakovani, v drugih nekoliko manj. pa število nakupov (251)). V vrsticah so predstavljeni Nato smo analizirali nakupovalne navade nekaterih izdelki, pika pa pomeni da je bil nek izdelek kupljen uporabnikov, kar je uporabno predvsem za potrebe (nakupi so predstavljeni na ordinatni osi). priporočanja in je tudi prvotni cilj projekta. Najprej smo se Verjetno nezaželeni izdelki za obravnavanega uporabnika so osredotočili na ‘negativno selekcijo’ priporočanja, torej smo tisti, ki se pojavijo na grafu le enkrat – na izseku so poskušali identificirati izdelke ki jim bomo dali negativno obarvani s sivo. Mera gotovosti za to, da smo pravilno utež. Tu je pomembno, da upoštevamo ‘mero gotovosti’, ki napovedali ‘nezaželene izdelke’ mora temeljiti na številu smo jo zaenkrat le opisno opredelili. Nato smo preverili, kaj nakupov, ki jih uporabnik opravi po tem, ko kupi lahko predvidimo o času nakupa nekega izdelka in po ‘nezaželen izdelek’ in na tipu izdelka (ali gre za izdelek ki statistični analizi prišli do zaključka, da za dovolj obsežno se sicer troši pogosto). količino podatkov lahko napovemo časovni interval, ko se Mogoče bi lahko tudi ugotovili, ali je uporabnik izdelek zgodi naslednji nakup in povedali, kako bi to lahko bilo zamenjal za nek substitut (temu bi potem ocena uporabno v smislu priporočanja. ‘zaželenosti’ narasla). Ta problem je sicer zelo občutljiv na število nakupov. 12 5. 
DODATEK

TABELA 1: učinki akcij

     šifra    izdelek                Povprečna sprememba    Povprečna sprememba
                                     prometa med akcijo     prometa po akciji
 1   149725   Toaletni papir PALOM         4458.97                9004.5
 2   146861   Mleko trajno alpsko,          266.166               1621.48
 3   147757   Napitek izotonicni S          591.6133               857.54
 4   159161   Jogurt navadni, cvrs         2872.275                786.33
 5   150673   Kuhinjske brisace PA          535.74                 781.155
 6   146485   Voda RADENSKA classi          821.8833               701.22
 7   159133   Voda, namiz                   199.06                 687.22
 8   164210   Mleko trajno Zelene           545.92                 522.09
 9   149226   Cvetaca                        43.5433               517.35
10   159129   Voda gazira                   -10                    421.43
11   149349   Kajzerica 55g                 287.75                 305.58
12   149998   Banane                        191.528                246.56
13   150231   Kruh rzeni                    276.896                198.98
14   147837   Jajca l,, 1                   330.43                 195.88
15   159057   Mleko trajno Zelene           339.415                134.4
16   147266   Mleko trajno lejko,           392.68                 125.09
17   151099   Sok lumpi, jabolko,           424.26                  79.36
18   146403   Pivo UNION, svetlo,             4.89                  59.71
19   159143   Pivo, svetl                   182.815                 54.505
20   147274   Cokolada GORENJKA, t           52.25                  47.84
21   151877   Keksi domacica origi           94.97                  36.27
22   148445   Pivo LAsKO CLUB, piv          221.245                 26.945
23   151988   Cokolada PR                    23.94                  16.87
24   149318   Sosedovo pecivo, s s          323.63                   4.51
25   151985   Cokolada PR                    17.86                   4.29
26   159151   Radler gren                  1821.7                    0.78
27   156492   Cokolada GORENJKA ml          354.97                 -21.12
28   151065   Cokolada BALI z rize          -40.19                 -22.7
29   164990   Cokolada MILKA noise           70.735                -50.39

TABELA 2: cikličnost potrošnje

GRAF 1: nakupi uporabnika

ANALIZA MOŽNOSTI ZAZNAVANJA PODOBNOSTI MED UPORABNIKI

Božidara Cvetković, Mitja Luštrek
Department of Intelligent Systems, Jožef Stefan Institute
Jamova cesta 39, 1000 Ljubljana, Slovenia
e-mail: {boza.cvetkovic, mitja.lustrek}@ijs.si

POVZETEK

Prispevek predstavlja preliminarne rezultate analize možnosti zaznavanja podobnosti med uporabniki. Cilj analize je izbrati najboljši pristop, ki bo uporabljen v metodi za prilagajanje modela uporabniku MCAT.

1 UVOD IN SORODNO DELO

V aplikacijah, kjer se uporabljajo modeli strojnega učenja za napovedovanje človeškega obnašanja, se pogosto dogaja, da točnost delovanja v realnem okolju ni primerljiva točnosti delovanja v laboratorijskem okolju. Razlog je tako omejena količina učnih podatkov, kot tudi fizična razlika ter razlika v navadah med ljudmi. Fizične razlike se kažejo bodisi v drugačnosti izvajanja akcij v primeru problema prepoznavanja aktivnosti ali v drugačnem metabolnem sistemu v primeru problema ocene porabe energije.

ali neomejena količina neoznačenih podatkov. Najbolj osnovna metoda je samo-učenje (self-training [1]), ki uporablja en klasifikator za označevanje podatkov in ročno nastavljen prag za odločitev o izbiri podatka za dodajanje v učno množico. Prag je po navadi nastavljen tako, da mora biti zaupanje v napoved 100%. Nadgradnja metode z enim klasifikatorjem je dodajanje več klasifikatorjev, ki so naučeni z različnimi algoritmi in za dodajanje uporabljajo večinski glas (Democratic co-learning [2]), ali več klasifikatorjev z istim učnim algoritmom in več dimenzijami (Co-training [3]). Pomanjkljivost prvega je v ročno nastavljenem pragu (100% zaupanje v napoved), problem drugega pa kompleksnost delitve prostora na dva ortogonalna dela ali dimenziji. Več o metodah pol-nadzorovanega učenja pišemo v našem preteklem delu, kjer smo prilagajali klasifikator za prepoznavanje aktivnosti novemu uporabniku.
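Zgoraj omenjeni postopek samo-učenja (self-training [1]) lahko ponazorimo s spodnjo splošno skico v Pythonu. Vmesnik klasifikatorja v slogu scikit-learn (metodi fit in predict_proba) ter način obravnave praga sta naši predpostavki in ne implementacija metode MCAT ali navedenih virov.

# Generic sketch of the self-training idea cited above ([1]): a single
# classifier labels unlabelled instances and only those predicted with
# (near) full confidence are added to the training set, after which the
# model is retrained. The scikit-learn-style interface (fit/predict_proba)
# and the threshold handling are assumptions, not the MCAT implementation.
import numpy as np

def self_training(classifier, X_labelled, y_labelled, X_unlabelled,
                  confidence_threshold=1.0, max_rounds=10):
    X, y = np.array(X_labelled), np.array(y_labelled)
    pool = np.array(X_unlabelled)
    for _ in range(max_rounds):
        if len(pool) == 0:
            break
        classifier.fit(X, y)
        probabilities = classifier.predict_proba(pool)
        confidence = probabilities.max(axis=1)
        selected = confidence >= confidence_threshold   # usually 100 % confidence
        if not selected.any():
            break
        pseudo_labels = probabilities[selected].argmax(axis=1)
        # move the confidently labelled instances into the training set
        X = np.vstack([X, pool[selected]])
        y = np.concatenate([y, classifier.classes_[pseudo_labels]])
        pool = pool[~selected]
    classifier.fit(X, y)
    return classifier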
Pokazali smo, da lahko z mehanizmom Točnost modela za določenega uporabnika lahko zvišamo na za prilagajanje novemu uporabniku (MCAT - Multi- dva načina: Classifier Adaptive Training [4]) in omejeno količino na • označimo dodatne učne podatke specifične za novo označenih podatkov (3 aktivnosti po 30 sekund) novega uporabnika in uporabimo nadzorovano zvišamo prepoznavanje aktivnosti za približno 12 odstotnih učenje za nov model ali točk. Ogrodje MCAT metode je okvirno predstavljena na • uporabimo katero od metod, ki nenadzorovano sliki 1. ali pol-nadzorovano prilagodijo model trenutnemu uporabniku. Neoznačena instanca Najboljše izboljšanje dobimo z označevanje dodatnih Učna množica Učna množica osnovnega specifičnega podatkov. Vendar je ta proces časovno zelo zahteven, modela modela duhamoren in drag, tako za označevalca kot za uporabnika. Velikokrat se zgodi, da je samo označevanje podatkov v Osnovni model Specifični model ciljnem okolju onemogočeno, bodisi zaradi samega Posodobi klasifikacijskega problema (označevanje padcev je lahko nevarno) ali pa zato, ker nam manjkajo dodatne naprave, ki Ponovno učenje Selekcija niso mobilne in jih lahko uporabljamo izključno v modela laboratoriju (poraba človeške energije iz izdihanega zraka). V tem primeru se izkažejo rešitve, ki uporabljajo pol- Dodajanje nadzorovano učenje, bolj primerne. Metode pol- Označena instanca nadzorovanega učenja označijo neoznačene podatke in glede na določeno pravilo izberejo ali zavržejo trenutni podatek za dodajanje v učno množico. Nad učno množico, ki vsebuje Slika 1: Ogrodje metode MCAT nove podatke, se nato uporabi nadzorovan algoritem za strojno učene za pridobitev novega, prilagojenega modela. Ogrodje MCAT pričakuje naslednje klasifikatorje: Metode pol-nadzorovanega učenja lahko kategoriziramo na • osnovni model: model, ki se je v laboratorijskem okolju več načinov. Glede na število klasifikatorjev, glede na izkazal za najboljšega, število dimenzij (ortogonalnost atributnih vektorjev), glede • specifični model: model ali množica modelov, ki na način prilagajanja in glede na to ali se uporablja omejena vsebujejo znanje o specifikah trenutnega uporabnika, 14 • selekcija: model, ki izbere končno oznako, katere podobnost smo ugotavljali. Cilj je pridobiti množico • dodajanje: model, ki se odloča, ali je trenutna instanca oseb, ki so najbolj podobni novemu uporabniku in iz njihovih dovolj kvalitetna za dodajanje v učno množico modelov oceniti porabo energije novega uporabnika. osnovnega modela. Točnost ocene mora biti višja od splošnega modela, ki je Cilje trenutne raziskave je uporabiti isto ogrodje na naše izhodišče. regresijski domeni, bolj specifično za oceno porabe človeške energije. Označevanje podatkov za novega uporabnika je v 3.1 Razvrščanje v skupine ali gručenje tem primeru onemogočeno, saj bi uporabnik moral v Za razvrščanje v skupine smo uporabili algoritem k-means iz laboratorij, kjer se nahajajo potrebne naprave (Cosmed orodja za strojno učenje Weka [6]. Za idealno število gruč k4b2). smo uporabili koeficient Silhouette, ki poda mero, kako Ta prispevek predstavlja analizo pristopov za možnost dobro podatek ustreza trenutni gruči. Koeficient je definiran detekcije podobnosti med uporabniki. Privzeli bi, da pristop z naslednjo enačbo. z najboljšim delovanjem opiše trenutnega uporabnika ܾሺ݅ሻ − ܽሺ݅ሻ zadosti dobro da ga lahko uporabimo kot specifični model v ݏሺ݅ሻ = max {ܽሺ݅ሻ,ܾሺ݅ሻ} MCAT algoritmu. 
−1 ≤ ݏሺ݅ሻ ≤ 1 2 NABOR PODATKOV V raziskavi smo uporabili dva nabora podatkov in sicer Za izračun koeficienta uporabnika i uporabimo podatke, ki so uporabljeni kot učna množica splošnega • a(i) - povprečna razdalja vseh uporabnikov v modela in pa nabor podatkov, ki predstavlja bio-impedanco gruči oseb vsebovanih v učni množici splošnega modela. • b(i) - najmanjša razdalja trenutnega uporabnika do sosednje gruče Učna množica splošnega modela je bila zbrana v Ustreznost gruče je definirana z velikostjo koeficienta. kontroliranem laboratorijskem okolju Fakultete za Šport in Najbolj ustrezna delitev je pri s(i) = 1, če je koeficient blizu vsebuje podatke 10 ljudi, ki so izvajali vnaprej določene 0 je na robu dveh gruč in če je -1 verjetno bolj ustreza drugi sklope aktivnosti. Opremljeni so bili s pospeškomeri na gruči. Izračunan koeficient za tri osebe lahko vidimo na Sliki prsih in stegnu, prsnim pasom za merjenje srčnega utripa, 2. Za osebi A in B je najboljša delitev na dve gruči in za napravo Senswear, ki meri oddajanje toplote človeka, osebo H na 5 gruč. galvanski odziv kože in telesno temperaturo ter oceni človekovo porabo energije in indirektnim kalorimetrom 0.80 Cosmed k4b2, ki meri porabo energije na osnovi izdihanega 0.70 0.63 0.62 0.62 ogljikovega dioksida in porabe kisika. Ta nabor podatkov je 0.75 0.58 0.49 0.55 0.60 ette 0.52 u 0.46 bil uporabljen za gradnjo in vrednotenje ve o č regresijskih 0.50 0.40 0.43 0.40 ilh modelov za oceno porabe energije. Izbran je bil najboljši, ki 0.40 t S 0.46 0.46 n 0.47 0.46 vsebuje podatke pospeškomerov, blizu telesne temperature 0.30 0.07 eficie 0.07 in sr 0.20 čnega utripa. Ta model je privzet za splošni model, o 0.33 0.03 0.04 0.31 K 0.31 0.10 0.25 deluje s povprečno absolutno napako (MAE) 0.55 MET 0.00 (Metabolic Equivalent of Task). 1 2 3 4 5 6 7 8 9 10 Število gruč Učna množica bio-impedance so podatki, pridobljeni iz Oseba A Oseba B Oseba H naprave InBody [1], ki analizira sestavo telesa. Podatki Slika 2: Silhouette koeficient ustreznosti delitve. vsebujejo: višino, težo, starost, količino vode v celicah, izven celic, količino proteinov, mineralov, maščobe, maso skeleta, S to metodo smo dobili gruče podobnih oseb. indeks telesne teže, razmerje med pasom in boki in podatke o teži udov. Vsebuje tudi maksimalne in minimalne 3.2 Meta klasifikacija vrednosti za vsak tip podatkov, kar smo uporabili na normalizacijo in dodali se maksimalen in minimalen srčni Za uteževanje ocen smo poizkusili še meta-klasifikator za utrip uporabnika. Ta je bil umerjen med 15 minutnim vsako osebo posebej. Za meta-klasifikator smo uporabili ležanjem (minimalen srčni utrip) in po dveh minutah podatke osmih oseb pri ocenjevanju devete. Za končne intenzivnega teka (maksimalen srčni utrip). To učno evaluacijo smo uporabili deseto osebo. množico smo uporabili za ugotavljanje podobnosti med Začetno množico atributov meta klasifikatorja sestavljajo uporabniki. naslednji atributi: • evklidske razdalje od trenutne osebe do vseh 3 PRISTOP ZA UGOTAVLJANJE PODOBNOSTI oseb v gruči, MED UPORABNIKI • trenutna razpoznana aktivnost osebe, Podobnost med uporabniki smo analizirali z uporabo nabora • nivo aktivnosti (nizka, srednja, visoko), podatkov bio-impedance in testirali na podatkih osebe, • normaliziran srčni utrip osebe, 15 Tabela 1: Rezultati glede na pristop ugotavljanja podobnosti med uporabniki. Pristopi so opisani v sekciji 3.3. 
Pristopi Splošni model (MAE) Število gruč Število oseb v gruči A B C D E F Oseba A 0.49 2 8 0.53 0.49 0.49 0.49 0.48 0.48 Oseba B 0.69 2 3 0.77 0.69 0.70 0.69 0.73 0.69 Oseba C 0.64 3 4 0.75 0.60 0.61 0.60 0.58 0.59 Oseba D 0.55 4 1 0.93 0.54 0.54 0.54 0.48 0.49 Oseba E 0.44 2 8 0.40 0.47 0.48 0.47 0.44 0.44 Oseba F 0.55 2 8 0.68 0.60 0.60 0.60 0.55 0.55 Oseba G 0.57 2 8 0.50 0.61 0.62 0.61 0.56 0.56 Oseba H 0.46 5 2 0.42 0.51 0.51 0.51 0.46 0.46 Oseba I 0.64 2 8 0.67 0.63 0.63 0.63 0.72 0.63 Oseba J 0.50 6 1 0.65 0.47 0.47 0.47 0.53 0.50 Povprečno 0.55 0.63 0.56 0.56 0.56 0.55 0.54 • povprečna absolutna napaka ocene modela Pristop B: Vsako instanco ocenijo modeli oseb, ki so v gruči osebe glede na oceno splošnega modela, ,in končna ocena je povprečje ocen. • cona srčnega utripa po metodi Zoladz [8], • procent povprečne absolutne napake ocene Pristop C: Vsako instanco ocenijo modeli oseb, ki so v gruči, modela glede na oceno splošnega modela. in končna ocena je utežena vsota glede na evklidsko razdaljo do centroide v gruči. Delovanje meta-klasifikatorja je naslednje. Vsako instanco se oceni z modeli oseb, ki so v gruči, in vsaka ocena je Pristop D: Vsako instanco ocenijo modeli oseb, ki so v gruči, ovrednotena s svojim meta-klasifikatorjem, ki vrne enega od in končna ocena je utežena vsota glede na evklidsko razdaljo dveh razredov: »da« ali »ne«. Da pomeni, da se ocena do nove osebe v gruči. Če je v gruči ena oseba je rezultat uporabi, in ne, da se zavrže. Poleg vsake klasifikacije utežena vsota splošnega modela in modela osebe. klasifikator vrne stopnjo zaupanja v svojo napoved. Končna ocena se izračuna glede na število modelov, katerih rezultat Pristop E: Za oceno so uporabljeni meta klasifikatorji in je bil »da«: modeli vseh oseb. - število »da« > 1; normalizira se stopnja zaupanja za vsak model, ki je klasificiral »da«. Normalizirane Pristop F: Za oceno so uporabljeni meta klasifikatorji in stopnje se uporabijo kot utež trenutne ocene in modeli oseb v gruči. utežena vsota vseh tvori končno oceno, - število »da« = 1; stopnja zaupanja je uporabljena 4 REZULTATI kot utež ocene tega modela. Ostanek je uporabljen kot utež ocene splošnega modela. Utežena vsota Rezultati predstavljajo evaluacijo vseh omenjenih pristopov. obeh tvori kon Cilj je izbrati pristop, ki vrača manjšo ali primerljivo točnost čno oceno, - število »da« = 0; uporabi se ocena splošnega splošnemu modelu. Rezultati so predstavljeni v Tabeli 1 in modela sicer z povprečno absolutno napako (MAE) definirano z naslednjo enačbo: ௡ Uporabnost atributov smo ovrednotili s kombiniranjem vseh 1 ܯܣܧ = in izločili tiste atribute, ki ne pripomorejo k boljši točnosti ݊ ෍หܧܧ௢௖௘௡௝௘௡௔ − ܧܧ௣௥௔௩௔ห izbire in hkrati točnosti ocene. Atributi, ki so ostali v ௜ୀଵ Končna ocena najboljšega pristopa je ocenjena z povprečno končnem vektorju atributov, so: absolutno procentualno napako definirano z naslednjo • evklidske razdalje od trenutne osebe do vseh enačbo (MAPE): oseb v gruči, 100% ௡ ܧܧ • trenutna razpoznana aktivnost osebe, ܯܣܲܧ = ௢௖௘௡௝௘௡௔ − ܧܧ௣௥௔௩௔ ݊ ෍ ቤ ܧܧ ቤ • nivo aktivnosti (nizka, srednja, visko), ௣௥௔௩௔ ௜ୀଵ • cona srčnega utripa po Zoladz metodi [8]. V obeh enačbah EEocenjena predstavlja oceno porabe energije, 3.3 Pristopi kot jo vrne regresijski model in EEprava je izmerjena poraba Pristop A: Vsako instanco oceni devet modelov (posamezni energije. model osebe) in kon Točnost splošnega modela je predstavljena v drugem stolpcu čna ocena je povprečje ocen. Tabele 1. Povprečna napaka modela je 0.55 MET in MAPE modela je 25%. 
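The following sketch restates the two error measures used in the evaluation and adds a distance-weighted combination of per-person estimates in the spirit of approaches C and D. The per-person estimates and Euclidean distances are assumed to be available; function names are illustrative.

```python
# MAE, MAPE and a distance-weighted combination of per-person estimates.
import numpy as np

def mae(ee_estimated, ee_true):
    """Mean absolute error in MET."""
    return float(np.mean(np.abs(np.asarray(ee_estimated) - np.asarray(ee_true))))

def mape(ee_estimated, ee_true):
    """Mean absolute percentage error, in percent."""
    est, true = np.asarray(ee_estimated, float), np.asarray(ee_true, float)
    return float(100.0 * np.mean(np.abs((est - true) / true)))

def weighted_estimate(estimates, distances):
    """Weight each person's estimate by inverse Euclidean distance, so closer
    (more similar) persons contribute more to the final energy estimate."""
    weights = 1.0 / (np.asarray(distances, float) + 1e-9)
    weights /= weights.sum()
    return float(np.dot(weights, estimates))
```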
Prvi pristop (pristop A) uporabi povprečno 16 oceno vseh oseb. Iz rezultata lahko vidimo, da se napaka [8] Zoladz, http://en.wikipedia.org/wiki/Heart_rate poveča in da ta pristop ni pravilen, kar je tudi v skladu s hipotezo, da uporabljen model mora biti podoben modelu končne osebe. Pristop B uporabi dodatno znanje o medsebojni podobnosti oseb in za končno oceno uporabi povprečje ocen podobnih oseb (osebe v isti gruči). Rezultat je slabši od splošnega modela, tako v obliki MAPE 26% kot tudi MAE 0.56 MET. Pristop C uporabi za utež napovedi evklidsko razdaljo osebe do centroide. Končna točnost je slabša od splošnega modela in sicer 0.56 MET in 26% v obliki MAPE. Pristop D vrne primerljive rezultate kot pristopa B in C. Pristop E uporabi meta-klasifikator, vendar na vseh osebah. Iz rezultata lahko vidimo, da z vpeljavo meta klasifikatorja dosežemo primerljivo točnost, kot ga dobimo s splošnim modelom. Če uporabimo meta klasifikatorje samo na osebah ki so v gruči, pa pridobimo na točnosti in sicer 0.01 MET v obliki MAE in 3 odstotne točke v obliki MAPE. 5 ZAKLJUČEK Ta prispevek predstavlja preliminarne rezultate analize pristopov za ugotavljanje podobnosti med uporabniki. Analiza je bila narejena na domeni ocene porabe človeške energije z namenom definirati specifični model za pol- nadzorovano metodo MCAT, katero bomo v prihodnjem delu nadgrajevali. Pristop, ki vrača najboljšo točnost, uporablja algoritem gručenja za delitev oseb v skupine po podobnosti in meta klasifikatorje posameznih oseb v gruči za končno oceno porabe energije osebe. Z uporabo pristopa za podobnost izboljšamo rezultat najboljšega modela za 3 odstotne točke. Prihodnje delo zajema razširitev pristopov in uporabo najboljšega pristopa v metodi MCAT. References [1] Frinken, V., Bunke, H.: Self-training Strategies for Handwriting Word Recognition. In: Perner P. (eds.) Advances in Data Mining. Applications and Theoretical Aspects. LNCS, vol. 5633, pp. 291--300, 2009. [2] Zhou, Y., Goldman, S.:Democratic Co-Learning. In: 16th IEEE International Conference on Tools with Artificial Intelligence, pp. 594--602, IEEE press, 2004 [3] Blum, A. and Mitchell, T.: Combining labeled and unlabeled data with co-training. In: 11th annual conference on Computational learning theory, pp. 92-- 100, 1998. [4] B. Cvetković, B. Kaluža, M. Luštrek, M. Gams, “Adapting Activity Recognition to a Person with Multi- Classifier Adaptive Training,” Journal of Ambient Intelligence and Smart Environments. Accepted for publication, 2014. [5] InBody, http://www.e-inbody.com/ [6] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten. The WEKA Data Mining Software: An Update. SIGKDD Explorations. 11(1), pp. 10–18, 2009. [7] Silhouette koeficient, http://en.wikipedia.org/wiki/Silhouette_(clustering) 17 VISUALIZATION OF EXPLANATIONS OF INCREMENTAL MODELS Jaka Demšar, Zoran Bosnić , Igor Kononenko University of Ljubljana, Faculty of Computer and Information Science Večna pot 113, SI-1000 Ljubljana, Slovenia e-mail: jaka.demsar0@gmail.com, {zoran.bosnic, igor.kononenko}@fri.uni-lj.si ABSTRACT Another method, Page-Hinkley test [10] was devised to The temporal dimension that is ever more prevalent in detect the change of a Gaussian signal and is commonly data makes the data stream mining ( incremental used in signal processing. learning) an important field of machine learning. In Bare prediction quality is not a sufficient property of a good addition to accurate predictions, explanations of models machine learning algorithm. 
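The Page-Hinkley test mentioned above can be sketched as follows. This is a generic formulation of the test; the tolerance `delta` and alarm threshold `lam` are illustrative parameter values, not values used in the paper.

```python
# Compact Page-Hinkley change detector: accumulates deviations of the monitored
# signal from its running mean and signals a change when the accumulated value
# rises far enough above its historical minimum.
class PageHinkley:
    def __init__(self, delta=0.005, lam=50.0):
        self.delta = delta      # allowed drift before deviations start accumulating
        self.lam = lam          # alarm threshold
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0          # cumulative deviation m_T
        self.cum_min = 0.0      # minimum of m_T observed so far

    def update(self, x):
        """Feed one observation; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return (self.cum - self.cum_min) > self.lam
```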
Explanation (a form of data and examples are a crucial component as they provide postprocessing) of individual predictions and model as a insight into model's decision and lessen its black box whole is needed to increase the user's trust in the decision nature, thus increasing the user's trust. Proper visual and provide insight in the workings of the model, which representation of data is also very relevant to user's increases the models credibility. IME ( Interactions-based understanding -- visualization is often utilised in Method for Explanation) [13] with its efficient adaptation machine learning since it shifts the balance between [12] is a model independent method of explanation, which perception and cognition to take fuller advantage of the also addresses interactions of features and therefore brain's abilities. In this paper we review visualisation in successfully tackles the problem of redundant and incremental setting and devise an improved version of disjunctive concepts in data. The explanation of the an existing visualisation of explanations of incremental prediction for each instance is defined as a vector of models. We discuss the detection of concept drift in data contributions of individual feature values. Positive streams and experiment with a novel detection method contibution implies that the particular feature value that uses the stream of model's explanations to positively influenced the prediction (and vice versa) while determine the places of change in the data domain. the absolute value of a contributon is proportional to the magnitude of influence on the decision, i.e. the importance 1 INTRODUCTION of that feature value. The sum of all contributions is equal to the difference between the prediction using all feature values Data streams are becoming ubiquitous. This is a and a prediction using no features (prediction difference). consequence of the increasing number of automatic data The explanation of a single prediction can be expanded to feeds, sensoric networks and internet of things [1]. The the whole model [12] and also to incremental setting [3]. In defining characteristics of data streams are their transient the latter case, drift detection (SPC) and adaptation are used dynamic nature and temporal component. In contrast with to compensate for concept drift. Explanation of a data stream static datasets (used in batch learning), data streams (used in is therefore itself a data stream. incremental learning) are large, changing, semi-structured Related to exaplantion is data visualisation - a versatile tool and possibly unlimited. This poses a challenge for storage in machine learning that serves two purposes; sense-making and processing as the data can be only read once. For (data analysis) and communication as it conveys abstract incremental learning models, operations of model increment concepts in a form, understandable to humans (it shifts the and decrement are vital. Concepts and patterns in data balance between perception and cognition to take fuller domain can change ( concept drift) - we need to adapt to this advantage of the brain's abilities [4]). The majority of phenomenon or the quality of our predictions deteriorates. published visualizations depict data that has a temporal According to PAC ( Probably approximately correct) component [8]. In this context, visualization acts as a form learning model, if the distribution, generating the instances of summarization, since the datasets can be extremely large. 
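A sampling approximation in the spirit of the IME method can be sketched as below; this is an illustrative approximation, not the authors' implementation. The contribution of feature i is estimated as the average change in the model's output when feature i is switched from the explained instance to a randomly drawn instance, averaged over random feature orderings. `model` is assumed to be any callable returning a scalar prediction for a single NumPy instance.

```python
# Sampling-based estimate of one feature value's contribution to a prediction.
import numpy as np

def contribution(model, X_background, x, i, n_samples=500, seed=0):
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        z = X_background[rng.integers(len(X_background))]   # random reference instance
        order = rng.permutation(len(x))
        pos = int(np.where(order == i)[0][0])
        tail = order[pos + 1:]                               # features "after" i in this ordering
        with_i, without_i = x.copy(), x.copy()
        with_i[tail] = z[tail]                               # i kept from x, tail taken from z
        without_i[tail] = z[tail]
        without_i[i] = z[i]                                  # i also taken from z
        total += model(with_i) - model(without_i)
    return total / n_samples
```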
is stationary, the error rate for sound machine learning The challenge lies in representing the temporal component algorithms will decline towards the Bayes error rate as the (including concept drift), especially if we are limited to two- number of processed instances increases [9]. Consequently, dimensional non-interactive visualisations. when a statistically significant rise in error rate is detected, The main goal of this paper is improving the existing we can suggest that there has been a change in the methodology for visualising explanations of incremental generating distribution - concept drift. models [3]. The feature value contributions are represented The basis of statistical process control ( SPC) [5] is with customised bar charts. Multiple such charts are detecting statistically significant error rate (using the central required to explain the model at diffrerent points in time. limit theorem) by monitoring the mean and standard They become very difficult to read as a whole because of deviation of a sequence of correct classification indicators. the large number of visual elements that we have to compare (we sacrifice macro view completely in favour of 18 micro view). To consolidate these images and address the out (micro view), while general patterns and trends can be change blindness phenomenon, charts are stacked into a recognised in the shapes of lines that are intuitive single plot, where the age and size of the exaplanation are representations of flowing time (macro view). The resulting represented with transparency (older and "smaller" visualisations are dense with information, easily explanations fade out). The resulting visualisation is not understandable (conventional plotting of independent tainted by first impressions (as it is only one image) and is variable, time, on x axis) and presented in gray-scale palette, adequately dense and graphically rich. However, the major making them more suitable for print. flaw of this approach lies in the situations when columns, representing newer explanations override older ones and 3 DETECTING CONCEPT DRIFT USING THE thus obfuscate the true flow of changing explanations, for STREAM OF EXPLANATIONS example, when the concept drift precipitates the attribute When explaining incremental models, the resulting value contributions to increase in size without changing the explanations are, in themselves, a data stream. This gives us sign. Concepts can therefore become not only hidden; the option to process it with all the methods used in what's more, the visualization can be deceiving, which we incremental learning. In our case, we'll devise a method to consider to be worse than just being too sparse. Therefore, detect outliers in the stream of explanations and declare such we need to clarify the presentation of the concept drift along points as places of concept drift. The reasoning behind this is with an accurate depiction of each explanation's the notion that if the model does not change, then also the contributions while maintaining the macro visual value, that explanation of the whole model will not change. When an enables us to detect patterns and get a sense of true concepts outlier is detected, we consider this to be an indicator of a and flow of changes behind the model. significant change in model and thus also in the underlying An additional goal was to devise a method of concept drift data. 
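For reference, an SPC-style error-rate monitor of the kind referred to above might look as follows. This follows one common formulation (warning at two, drift at three standard deviations above the best error seen so far); the thresholds and the warm-up length are assumptions, not taken from the paper.

```python
# Error-rate drift monitor based on the mean and standard deviation of the
# stream of correct/incorrect classification indicators.
import math

class SPCDetector:
    def __init__(self):
        self.n = 0
        self.errors = 0
        self.p_min, self.s_min = float("inf"), float("inf")

    def update(self, correct):
        """Feed one boolean indicator; return 'in-control', 'warning' or 'drift'."""
        self.n += 1
        self.errors += 0 if correct else 1
        p = self.errors / self.n
        s = math.sqrt(p * (1 - p) / self.n)
        if self.n < 30:
            return "in-control"                   # short warm-up avoids spurious alarms
        if p + s < self.p_min + self.s_min:       # remember the best (lowest-error) state
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + 3 * self.s_min:
            return "drift"
        if p + s > self.p_min + 2 * self.s_min:
            return "warning"
        return "in-control"
```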
In addition to this, the method provides us with a detection which monitors the stream of explanations and stream of explanations that is continuous to a certain degree detects anomalies in it; the detected anomalies are of granularity and so enables us to overview the concepts interpreted as a concept drift. We test the improved behind the data at more frequent intervals than the existing visualization and the novel concept drift detection method explanation methodology. on two datasets and evaluate the results. We use a standard incremental learning algorithm [5] (learn by incrementally updating the model, decrement the models 2 VISUALISATION FOR INCREMENTAL MODELS if it becomes too big according to the parameter, rebuild the When visualising explanations of individual predictions, model if we detect change [6]) and introduce some horizontal bar charts are a fitting method also in the additional parameters. Granularity determines how often the incremental setting. Individual examples are always explanation of the current model will be triggered. The explained according to the current model which, in our case, generated stream of explanations (vectors of feature value can change. This is not an obstacle, since the snapshot of the contributions) will be compared using cosine distance. For model is in fact the model that classified the example. each new explanation, the average cosine distance from all This approach fails with explanations of incremental models other explanations that are in the current model, is as we need a new figure for each local explanation. To calculated. These values are monitored using the Page successfully represent the temporality of incremental models, Hinkley test. When the current average cosine distance from we use two variations of a line plot where the x axis contains other explanations has risen significantly, we interpret that as time stamps of examples and the splines plotted are various a change in data domain - concept drift. The last examples representations of contributions ( y axis). are then used to rebuild the model, the Page Hinkley statistic The first type of visualization (Figures 2 and 3) has one line and the local explanation storage are reset (to monitor the plot for each attribute. Contributions of values of the new model). individual attribute are represented with line styles. The The cosine distance is chosen because, in the case of mean positive and mean negative contribution of the explanations, we consider the direction of the vector of attribute as a whole are represented with two thick faded contributions to be more important than its size, which is lines. Solid vertical lines indicate the spots where very influential in the traditional Minkowski distances. The explanation of the model was triggered (and therefore page Hinkley test is used in favour of SPC because of its become the joints for the plotted splines), while dashed superior drift detection times [9] and the lack of need for a vertical lines mark the places where the actual concept drift buffer - examples are already buffered according to the occurs in data. The second type is an aggregated version granularity. The method is therefore model independent. (Figure 3) where the mean positive and mean negative contributions of all attributes are visualized in one figure. 
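A sketch of the proposed detector over the stream of explanations is given below: each new explanation vector is compared by average cosine distance to the explanations collected for the current model, and that average is fed to a change detector such as the Page-Hinkley sketch shown earlier. Class and method names are illustrative.

```python
# Drift monitor over the stream of explanation vectors.
import numpy as np

def cosine_distance(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - float(np.dot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

class ExplanationDriftMonitor:
    def __init__(self, detector):
        self.detector = detector      # any object with update(value) -> bool and reset()
        self.explanations = []        # explanations belonging to the current model

    def add(self, explanation):
        """Return True when the new explanation signals a concept drift."""
        explanation = np.asarray(explanation, float)
        if self.explanations:
            avg = float(np.mean([cosine_distance(explanation, e) for e in self.explanations]))
            if self.detector.update(avg):
                # drift detected: the caller rebuilds the model; reset local state
                self.explanations = [explanation]
                self.detector.reset()
                return True
        self.explanations.append(explanation)
        return False
```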
In 4 RESULTS these two ways we condense the visualization of incremental 4.1 Testing methodology and datasets models without a significant loss in information while still providing a quality insight into the model. Exact values of We test the novel visualisation method and the concept drift contributions along with timestamps of changes can be read detection method on two synthetic datasets, both containing multiple concepts with various degrees of drift between 19 them. These datasets are also used in previous work [3], so a direct assessment of visualization quality and drift detection performance can be made. The naive Bayes classifier and the nearest neighbour classifier are used. Their usage yields very similar results in all tests, so only results obtained by testing with Naive Bayes are presented. SEA concepts [11] is a data stream comprising 60000 instances with continuous numeric features xi ∈ [0,10], where i ∈ {1,2,3}. x1 and x2 are relevant features that determine the target concept with x1+x2 ≤ β where threshold β ∈ {7,8,9,9.5}. Concepts change sequentially every 15000 examples. Although the changes between the generated Figure 2: P eriodically triggered explanations (SEA). concepts are abrupt, class noise is inserted into each block. The instances of second dataset, STAGGER [2], represent geometrical shapes which are in the feature space described by size, color and shape. The binary class variable is determined by one of the three target concepts ( small ˄ green, green ˅ square, medium ˅ large ). 4500 instances are divided into four blocks (concept-wise) with examples mixing near the change points according to a sigmoid function, so the dataset includes gradual concept drift. 4.2. Improved visualizations Concept drifts in STAGGER dataset are correctly detected and adapted to as reflected in Figure 3. The defined concepts can be easily recognized from explanations triggered by the SPC algorithm - the change in explanation follows the change in concept. Windows generated by the vertical lines give us insight in local explanations of the model (where the concept is deemed to be constant). Disjunct concepts (2 and 3) and redundant feature values are all explained correctly (e.g. reduncacy of shape and disjunction of size values in concept 3). Figure 1 demonstrates how classifications of two Figure 3: E xplanations triggered at change detection instances with same feature values can be explained (STAGGER). completely differently at different times - adapting to change is crucial in incremental setting. This is also evident in the explanations on the STAGGER dataset yielded positive aggregated visualization, which can be used to quickly results. As depicted in Figure 4, the method correctly detects determine the importance of each attribute. concept drifts without false alarms and is in that regard For SEA dataset, explanations of instances are tightly similar to SPC method. The stream of explanations was corresponding to explanations of the model. As evident in similar to those obtained with other successful drift detection Figure 2, the shape of contributions of features reflects the methods. Choices of larger granulations yielded similar target concept; lower values increase the likelihood of results, but the change detection was obviously delayed. The positive classification and vice versa. 
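A generator for a SEA-like stream, following the description above, might look as follows; the 10 % class-noise level is an illustrative choice rather than the exact value used in the experiments.

```python
# SEA-concepts stream: three uniform features on [0, 10], class x1 + x2 <= beta,
# beta changing every 15000 examples, class noise flipped into each block.
import numpy as np

def generate_sea(n=60000, betas=(7.0, 8.0, 9.0, 9.5), block=15000, noise=0.1, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 10.0, size=(n, 3))            # x3 is irrelevant by construction
    beta = np.repeat(betas, block)[:n]                  # abrupt concept change every block
    y = (X[:, 0] + X[:, 1] <= beta).astype(int)
    flip = rng.random(n) < noise                        # insert class noise
    y[flip] = 1 - y[flip]
    return X, y

X_stream, y_stream = generate_sea()
```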
Feature x concept drift was however never missed, provided that the 1 is correctly explained as irrelevant with its only contributions being the granulation was smaller that the spacing between sequential result of noise. changes in data. The delays of concept drift detection are correlated with the magnitude of change. For example, the 4.3. Concept drift detection last concept drift was detected with significant delay. In this regard, the proposed method is inferior to SPC algorithm - Evaluating the concept drift detection using the stream of the concept drift detection in noticeably delayed and we're also dependant on two parameters – granulation and alert threshold, so the generality of the method is diminished. When testing with SEA datasets, the concept drift was not correctly detected. Changing the granulation and Page Hinkley alert threshold parameter resulted in varying degrees Figure 1: Explanations of a single prediction at different of false alarms or non reaction to change (Figure 4). This times. behaviour can be attributed to a small magnitude of change that occurs in data - the difference between concepts in data 20 is quite small and continuous. However, when explaining possibilities for visualisation would emerge, particularly this (incorrectly adapted) model, we still recognise true those that rely on finely granular data, such as ThemeRiver underlying concepts. This can be attributed to automatically [7]. decrementing the model when it becomes too big. It is important to note that this does not perform well in general, if the prior knowledge is insufficient for us to correctly decide on the maximum model size. We conclude that, in this form, the presented method is not a viable alternative to the existing concept drift detection methods. Its downsides include high level of parametrization which requires a significant amount of prior knowledge and can also become improper if the model changes drastically. Consequently, another assessment of data is needed - the required manual supervision and lack of adaptability in this regard can be very costly and against the requirements of a good incremental model. The concept drift detection is also not satisfactory - it is delayed in the best case or concepts can be missed or falsely alerted in the worst case. Another downside is the time complexity - the higher the granularity Figure 4: Performance of various change detection methods. the more frequent explanations will be, which will provide Yellow line indicates true change in concepts, green line us with a good stream of explanations but be very costly indicates change detection and adaptaion). time-wise. The method is therefore not feasible in environments where quick incremental operations are vital. However, if we can afford such delays, we get a granular References stream of explanations which gives us insight into the model [1] C. C. Aggarwal, N. Ashish, A. P. Sheth. The internet of for roughly any given time. things: A survey from the data-centric perspective. In A note at the end: we should always remember that we are Managing and Mining Sensor Data. Springer, 2013. explaining the models and not the concepts behind the [2] A. Bifet, G. Holmes, R. Kirkby, B. Pfahringer, M. Braun. model. Only if the model performs well, we can claim that Moa: Massive online analysis. our explanations truly reflect the data domain [12]. This can [3] Jaka Demšar. 
Explanation of predictive models and individual be tricky in incremental learning, as at the time of a concept predictions in incremental learning (In Slovene). B.S. Thesis, drift, the quality of the model deteriorates. University of Ljubljana, 2012. [4] S. Few. Now You See It: Simple Visualization Techniques for Quantitative Analysis 5 CONCLUSION . Analytics Press, 1st edition, 2009. [5] J. Gama. Knowledge Discovery from Data Streams. Chapman The new visualization of explanation of incremental model is & Hall/CRC, 1st edition, 2010. indeed an improvement compared to the old one. The [6] D. Haussler. Overview of the probably approximately correct overriding nature of the old visualisation was replaced with (PAC) learning framework, 1995. an easy to understand timeline, while the general concepts [7] S. Havre, B. Hetzler, and L. Nowell. Themeriver: Visualizing (macro view) can still be read out from the shape of the theme changes over time. In Proc. IEEE Symposium on Information Visualization, 2000. lines. Micro view is also improved as we can determine [8] C. Ratanamahatana, J. Lin 0001, D. Gunopulos, E. J. Keogh, contributions of attribute values for any given time. M. Vlachos, and G. Das. Mining time series data. In The Data The detection of concept drift using the stream of Mining and Knowledge Discovery Handbook. Springer, 2005. explanations did not prove to be suitable for general use [9] R. Sebastião and J. Gama. A study on change detection based on the initial experiments. It has shown to be hindered methods. In Progress in Artificial Intelligence, 14th by delayed detection times, missed concept drift Portuguese Conference on Artificial Intelligence, EPIA 2009 occurrences, false alarms, high level of parametrization and [10] E. S. Page. Continuous Inspection Schemes. Biometrika, Vol. potential high time complexity. This provides motivation for 41:100-115, 1954. further experiments in this field, especially because the [11] W. N. Street and Y. S. Kim. A streaming ensemble algorithm for large-scale classification. In Proceedings of the 7th ACM stream of explanations provides good insight into the model SIGKDD international conference on Knowledge discovery with accordance to the chosen granulation. and data mining, KDD ’01, New York, NY, USA, 2001. The main goal of future research is finding a true adaptation [12] E. Štrumbelj and I. Kononenko. An efficient explanation of of IME explanation methodology to incremental setting, i.e. individual classifications using game theory. The Journal of efficient incremental updates of explanation at the arrival of Machine Learning Research, 11:1–18, 2010. each new example. Truly incremental explanation [13] E. Štrumbelj, I. Kononenko, and M. Robnik Šikonja. methodology would provide us with a stream of explanations Explaining instance classifications with interactions of subsets of finest granularity. In addition to this, a number of new of feature values. Data Knowl. Eng., 68(10):886–904, October 2009. 
21 DETECTION OF IRREGULARITIES ON AUTOMOTIVE SEMIPRODUCTS Erik Dovgan1, Klemen Gantar2, Valentin Koblar3,4, Bogdan Filipič1,4 1 Department of Intelligent Systems, Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia 2 Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia 3 Kolektor Group d.o.o., Vojkova ulica 10, SI-5280 Idrija, Slovenia 4 Jožef Stefan International Postgraduate School, Jamova cesta 39, SI-1000 Ljubljana, Slovenia erik.dovgan@ijs.si, kg6983@student.uni-lj.si, valentin.koblar@kolektor.com, bogdan.filipic@ijs.si ABSTRACT ous classification modes, including the detection of individual The use of applications for automated inspection of types of irregularities with binary classifiers and the classifi- semiproducts is increasing in various industries, including cation of all types of irregularities with a single classifier. the automotive industry. This paper presents the devel- The paper is further organized as follows. The problem of opment of an application for automated visual detection detecting irregularities on commutators is presented in Sec- of irregularities on commutators that are parts of vehi- tion 2. Section 3 describes the application for detecting ir- cle’s fuel pumps. Each type of irregularity is detected on regularities that was designed and implemented in a proto- a partition of the commutator image. The initial results type form for a specific production line. The experiments and show that such an automated inspection is able to reliably results from the development process are presented and dis- detect irregularities on commutators. In addition, the re- cussed in Section 4. Finally, Section 5 concludes the paper sults confirm that the set of attributes used to build the with some ideas for future work. classifiers for detecting individual types of irregularities 2 PROBLEM DESCRIPTION and the priority of these classifiers significantly influence the classification accuracy. Commutators are parts of electric motors that periodically re- verse the current direction between the rotor and the external 1 INTRODUCTION circuit. If the electric motor is installed in the vehicle’s fuel pump, it has to withstand the chemical stress, which is usu- Information technology (IT) is replacing human work in nu- ally not the case for other types of electric motors. There- merous domains. Such a technology is especially suitable fore, special graphite-copper commutators are produced for for repetitive non-creative procedures where high accuracy is this purpose. required. Automotive industry is introducing IT in various The production of graphite-copper commutators involves segments, for example in storage management and automated several stages. Among them the most critical one is soldering inspection of semiproducts. of graphite and copper parts of the commutator. The quality Automated inspection of semiproducts can be done by an- of the soldered joint is crucial for the quality of the commuta- alyzing data from several sources, such as sensors, lasers and tor since even the smallest joint irregularity is unacceptable. cameras. Utilization of cameras for this purpose has several During the soldering phase, four types of irregularities may advantages, e.g., it is fast, thus not slowing down the pro- occur: duction line, it is cheaper in comparison to highly specialized sensors, and the same hardware can be used for inspection of 1. 
metalization defect, i.e., there are visible defects on the heterogeneous semiproducts. metalization layer, This paper presents the development of an application for 2. excess of solder, i.e., more solder is applied than feasi- automated visual inspection of commutators as semiproducts ble, for automotive industry. This application processes images of commutators with computer vision algorithms to obtain 3. deficit of solder, i.e., less solder is applied than feasible, the attributes describing visual properties of the commutator. and These attributes are then used by machine learning algorithms 4. disoriented, i.e., the copper part is not appropriately ori- to classify the commutators, i.e., to determine whether or not ented with respect to the graphite part. they contain irregularities and, in the case they do, what is the type of irregularities. Experimental detection of irregular- The analysis of these irregularities showed that each type oc- ities was performed using various sets of attributes and vari- curs only on a specific part of the commutator. Consequently, 22 when visually inspecting the commutator, its image can be filter is used, where the size of the median window is an input partitioned into four segments, each showing the presence or parameter. the absence of an individual type of irregularity, and into the In the next step, a threshold function is used to eliminate rest of the image that can be disregarded since it contains no pixels that are not relevant for detecting the irregularities. information about the irregularities. Currently, these irregu- Each ROI is processed with a specific value of the binary larities are detected through manual inspection of the commu- threshold. This step results in a black (background) and white tators. This approach is time-consuming and its results may (relevant regions) image. be subjective. The goal of this research is to design and im- Connected pixels are then grouped together with the plement an automated visual inspection of commutators that connected-component labeling algorithm [5] in order to de- would overcome the weaknesses of the manual inspection. tect the connected regions. This enables to process relevant regions, i.e., particles, rather than single pixels. 3 AUTOMATED VISUAL INSPECTION OF In the last image processing step, a particle filter is used to GRAPHITE-COPPER COMMUTATORS remove small particles that can be present in the image due to noise. The size of the particles to be filtered is an additional The idea for the automated visual inspection of graphite- input parameter. copper commutators is to consist of three phases. Firstly, After the image is processed with the computer vision al- a digital image of the commutator is obtained. Secondly, gorithms, the following six attributes are calculated for each this image is processed using computer vision algorithms that ROI, i.e., for each type of irregularity: extract informative attributes. Finally, these attributes are used by classifiers to determine whether the irregularities are • the number of particles, present on the commutator and identify their type in the case of their presence. Before applying this inspection procedure • the cumulative size of particles in pixels, on the production line, the classifiers need to be built with machine learning algorithms. 
• the maximal size of particles in pixels, 3.1 Processing commutator images with computer • the minimal size of particles in pixels, vision algorithms • the gross/net ratio of the largest particle, and Commutator images are processed in several steps. Since the commutators are not properly aligned, their rotation angle and • the gross/net ratio of all particles. position in the image have to be determined first. The center of the commutator is detected by matching the image with These attributes are then used to build the classifiers and clas- the template image of the center. Next, the position of the sify the commutator images. commutator’s pin is found. The line between between the center of the commutator and the pin is used to determine the 3.2 Learning classifiers with machine learning al- rotation angle. gorithms The next step of image processing consists of determining The goal of the classifiers is to determine whether a commuta- four regions of interest (ROIs), one for each type of irregu- tor contains any irregularities. Two approaches were applied larities. Each ROI is obtained by applying the corresponding to solve this classification problem: binary mask to the image. Before applying the binary mask, the mask has to be properly positioned and rotated. To that 1. all the attributes were included in a single set of at- end, the information about the center of the commutator and tributes and a single classifier was built to classify the its rotation angle (obtained in the previous step) is used. As commutators into one out of five possible classes (either a result, four ROIs are obtained. They are further processed one of the four types of irregularities or no irregularity), with the same sequence of computer vision algorithms, where 2. each type of irregularity was detected with a binary clas- only the input parameter values of these algorithms are spe- sifier, where the binary classifiers were prioritized to cific for each ROI. determine the irregularity when irregularities of several At this stage, ROIs are in RGB format. However, prelim- types were detected. inary tests showed that in order to reliably detect the irregu- larities, only one color plane should be used. Moreover, these The classification approach using four binary classifiers was tests also showed, that the most appropriate color plane is the further structured based on the attributes and learning in- red one, with the exception of the excess of solder irregularity stances used when building the binary classifiers. Specifi- for which the best color plane is the blue one. Consequently, cally, when building a binary classifier for detecting irregu- the most appropriate color plane is extracted from each ROI larities of a particular type, four learning modes were tested: with respect to the observed irregularity. This extraction re- sults in gray-scale ROIs. 1. only attributes of the corresponding ROI and only com- Gray-scale ROIs are then filtered with the median filter to mutators that are either without irregularities or contain reduce noise from the images. For this purpose a 2D median irregularities of this particular type are used, 23 Class Number of images Learning mode Best priority Max. accuracy [%] Without irregularities 212 1 C1, C3, C2, C4 81.8 Metalization defect 35 2 C3, C2, C1, C4 77.1 Excess of solder 35 3 C2, C3, C1, C4 81.5 Deficit of solder 49 4 C1, C3, C4, C2 83.5 Disoriented 32 Table 3: The best binary classifier priorities and classifica- Table 1: Distribution of test images. 
tion accuracies of learning modes. Median Threshold Particle Highest Best learning Max. Class window size value size priority mode accuracy [%] Metalization defect 3 54 13 C1 4 83.5 Excess of solder 3 5 2 C2 4 83.2 Deficit of solder 5 78 760 C3 4 83.5 Disoriented 1 81 184 C4 4 83.5 Table 2: Input parameter values for the computer vision al- Table 4: The best learning modes and classification accura- gorithms. cies of binary classifier priorities. 2. all attributes, but only commutators that are either with- to, for example, correctly classify a commutator with irregu- out irregularities or contain irregularities of this particu- larity x1 when the binary classifier for irregularity x2 is used. lar type are used, Such performance is not guaranteed when building the binary classifiers for irregularity xi without taking into account the 3. only attributes of the corresponding ROI, but all commu- images of irregularities xj, i = j (learning modes 1 and 2). tators including irregularities of all types are used, and Consequently, when classifying the commutators with pre- viously unseen irregularities (learning modes 1 and 2), the 4. all attributes and all commutators including irregularities classification accuracy varies significantly with respect to the of all types are used. priority of classifiers as shown in Figure 1. These results also confirm that partitioning the classification problem into four 4 EXPERIMENTS AND RESULTS subproblems, one for each irregularity type, results in higher The proposed method for detecting irregularities was tested classification accuracy, but only if all attributes and commu- on a set of images of commutators without irregularities and tators with all irregularities are used when building the binary the ones containing irregularities. The distribution of the test classifiers (see the classification accuracy of learning mode 4 images among the irregularity classes is shown in Table 1. in comparison to classification accuracy of the single classi- The applied computer vision algorithms were imple- fier in Figure 1). On the other hand, when building the binary mented in Open Computing Language (OpenCL) [3] that is classifiers from the reduced set of attributes or the reduced set suitable for deploying on embedded many-core platforms and of irregularities, the obtained classification accuracy is lower installing in the production environments. More precisely, we than the classification accuracy of the single classifier. Fi- used the OCL programming package [2], which is an imple- nally, these results show that the priority of classifiers influ- mentation of OpenCL functions in the Open Computer Vision ences the classification accuracy. The priority is especially (OpenCV) library [4]. The connected-component labeling al- important when using learning modes 1 and 2. gorithm was implemented based on description from [5]. The The results were further analyzed with respect to various input parameter values of computer vision algorithms were priorities of binary classifiers and learning modes (see Tables determined using a tuning procedure described in [1] and are 3 and 4). For this purpose, the binary classifiers were abbre- shown in Table 2. The classifiers were built using the Weka viated as follows: machine learning environment [7]. In particular, the J48 algo- • C rithm, the Weka’s implementation of the C4.5 algorithm for 1 – the binary classifier for detecting metalization de- fects, building decision trees [6], was used for this purpose. 
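A sketch of how the prioritised binary classifiers could be combined is given below: one binary classifier per irregularity type is queried in a fixed priority order, the first positive answer determines the class, and "no irregularity" is the fallback. A scikit-learn decision tree stands in for Weka's J48, and the attribute matrix, labels and priority order are illustrative stand-ins rather than the paper's data.

```python
# Prioritised one-vs-rest classifiers for the four irregularity types.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

TYPES = ["metalization", "excess_solder", "deficit_solder", "disoriented"]

def train_binary_classifiers(X, y):
    """Learning mode 4: each classifier sees all attributes and all commutators,
    with a one-vs-rest target for its own irregularity type."""
    return {t: DecisionTreeClassifier(random_state=0).fit(X, (y == t).astype(int))
            for t in TYPES}

def classify(classifiers, x, priority=("metalization", "deficit_solder",
                                        "excess_solder", "disoriented")):
    for t in priority:                                  # first positive classifier wins
        if classifiers[t].predict(x.reshape(1, -1))[0] == 1:
            return t
    return "no irregularity"

# Stand-in data: 24 attributes (6 per ROI) and labels over the five classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 24))
y = rng.choice(TYPES + ["ok"], size=300)
clfs = train_binary_classifiers(X, y)
print(classify(clfs, X[0]))
```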
Figure 1 shows the classification accuracies obtained with • C2 – the binary classifier for detecting the excess of sol- the tested classifiers and learning modes. When binary classi- der, fiers are applied, all the permutations of priorities are tested, therefore a distribution of classification accuracy is shown. • C3 – the binary classifier for detecting the deficit of sol- The results indicate that the highest classification accuracy is der, and obtained using learning mode 4, i.e., when the attributes de- scribing all types of irregularities and the images of all com- • C4 – the binary classifier for detecting disoriented com- mutators are used to build the binary classifiers. This enables mutators. 24 Binary classifiers, learning mode 1 Binary classifiers, learning mode 2 Binary classifiers, learning mode 3 Binary classifiers, learning mode 4 Single classifier 0.85 0.8 0.75 Classification accuracy [%] 0.7 0.65 Classifier Figure 1: Classification accuracies of the tested classifiers and learning modes. Table 3 shows the best priorities of binary classifiers and the accuracy. Additional attributes could be extracted from the corresponding classification accuracy for each learning mode. images with machine vision algorithms. It would be also in- This table shows that the most important binary classifier is teresting to compare our results with the results produced by C1 since it has the highest priority in two cases. In addi- the existing methods for detecting irregularities on semiprod- tion, the highest classification accuracy is obtained when this ucts. The ultimate goal of this work is to put the automated classifier has the highest priority. The second most important inspection procedure into regular use on the production line. classifier is C3 since it has the highest priority once and the second-highest priority three times. ACKNOWLEDGEMENT Table 4 shows the best learning mode and the correspond- This work has been partially funded by the ARTEMIS Joint ing classification accuracy when the binary classifiers have Undertaking and the Slovenian Ministry of Economic Devel- the highest priority. These results show that the learning opment and Technology as part of the COPCAMS project mode 4 is the best one irrespectively of the binary classifier (http://copcams.eu) under Grant Agreement number that has the highest priority. Nevertheless, when classifier 332913, and by the Slovenian Research Agency under re- C2 has the highest priority, a lower classification accuracy is search program P2-0209. achieved than in other cases. References 5 CONCLUSIONS [1] V. Koblar, E. Dovgan, and B. Filipič. Tuning of a machine- This paper presented the development of an automated proce- vision-based quality-control procedure for semiproducts in au- dure for visual detection of irregularities on graphite-copper tomotive industry. 2014. Submitted for publication. commutators after the soldering of graphite and copper in the [2] OpenCL module within OpenCV library. http: production process. Four types of irregularities were detected //docs.opencv.org/modules/ocl/doc/ introduction.html. a) with a single classifier and b) by partitioning the prob- [3] OpenCL: The open standard for parallel programming. http: lem into four subproblems, learning the binary classifiers for //www.khronos.org/opencl/. each irregularity type and assigning priorities to the classi- [4] OpenCV: Open source computer vision. http://opencv. fiers. The results show that the highest classification accu- org/. 
racy is achieved when the binary classifiers are used that are [5] D. P. Playne, K. A. Hawick, and A. Leist. Parallel graph com- trained on the data of all types of irregularities. The results ponent labelling with GPUs and CUDA. Parallel Computing, 36(12):655–678, 2010. also indicate that the priority of classifiers significantly in- [6] J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan fluences the classification accuracy and therefore needs to be Kaufmann Publishers, 1993. taken into account. [7] Weka Machine Learning Project. http://www.cs. In the future work we will test additional machine learn- waikato.ac.nz/ml/weka/index.html. ing algorithms for potential improvement of the classification 25 AN ELDERLY-CARE SYSTEM BASED ON SOUND ANALYSIS Martin Frešer1, Igor Košir2, Violeta Mirchevska1, Mitja Luštrek1 Department of Intelligent Systems, Jožef Stefan Institute, Slovenia1 Smart Com d.o.o., Slovenia2 e-mail: martin.freser@gmail.com, igor.kosir@smart-com.si, {violeta.mircevska, mitja.lustrek}@ijs.si ABSTRACT reducing the costs for elderly care and the burden put on the working-age population. This paper proposes an elderly-care system, which uses a This paper presents an elderly-care system, which uses a single sensing device installed in the user's home, single sensing device installed in the user's home, primarily primarily based on a microphone. We present based on a microphone. A microphone may serve both as a preliminary results on human activity recognition from sensor and as a communication device. As a sensor, a sound data. The recognition is based on 19 types of sound microphone may be used for detecting user's activity (e.g. features, such as spectral centroid, zero crossings, Mel- sleep, eating, opening a door) and consequently reasoning frequency cepstrum coefficients (MFCC) and linear about potential problems related to the user (e.g. the user did predictive coding (LPC). We distinguished between 6 not eat whole day, the user is sleeping much more than usual). classes: sleep, exercise, work, eating, home chores and As a communication device, it allows the user to initiate home leisure. We evaluated the recognition accuracy specific services by simply saying a keyword (e.g. call for using 4 supervised learning algorithms. The highest help). It is also needed for remote user-carer communication. accuracy, obtained using support vector machines, was Elderly care based on microphone has not received a lot of 76%. attention, although technology acceptance studies show that most users would accept to have a microphone for home care 1. INTRODUCTION services. Ziefle et al. [3] performed a user acceptance study Predictions made by the Statistical Office of the European comparing three home-integrated sensor types: microphone, Communities state that the over-65 population in EU28 camera and positioning system. According to this study, the expressed as a percentage of the working-age population microphone (plus speaker) is the most accepted technology, (aged between 15 and 64) will rise from 27% in 2014 to 50% followed by the positioning system, while the camera is in 2060 [1]. This demographic trend puts an immense ranked last. pressure to change current health and care practices, which The paper is organized as follows. In Section 2, we already accounts for around 10 % of EU's GDP spending [2]. describe the system architecture. Activity recognition based Innovative remote care systems are emerging, which motivate on sound analysis is presented in Section 3. 
Evaluation of the and assist the elderly to stay independent for longer, thus presented approach on real-world recordings follows in Figure 1: An elderly-care system architecture. 26 Section 4. Section 5 concludes the paper and presents future work. 2. SYSTEM ARCHITECTURE Figure 1 presents the architecture of the elderly-care system. Most of today’s commercial elderly-care systems offer a so-called emergency call functionality. The user is wearing a red button using which he/she may call for help in case of emergency. By pressing the red button, the care-system establishes a phone connection to a carer or a call center through a telephone network. We use a private branch exchange network (PBX) for establishing such calls. We extend this functionality with sound analysis in order Figure 2: The activity recognition process. to provide context to the emergency call (e.g. past user Then we extract sound features. Each feature is extracted activity), as well as to provide higher safety – the system in 20 ms long window. This is because sound signal is establishes an emergency call when certain types of sound, constantly changing and in such short window we assume that such as screaming, are detected. In order to do so, a cloud- is not changing statistically much. We still have to have based system is established consisting of 4 main components: enough samples though, so shorter windows are sensor engine, data unit, notification engine and control unit. inappropriate. We can also use window overlap so we lose The sensor engine analyses the sound in the apartment in less information. We used 20% of window overlap. order to detect user’s activity (e.g. eating, sleep) or critical We extracted 19 types of features and we aggregated them sound patterns (e.g. screaming, fire alarm). The output of this in a window of one-minute length. We put together all engine is kept in the data unit. In case of emergency detected recordings that were recorded in one minute, then we through sound analysis, the control unit notifies a carer or a extracted each feature in 20 ms window and we aggregated it specialized call center through the notification engine about using mean and standard deviation, so we got a feature vector, the user who needs help and why automatic emergency call is which represented one minute. We also tried to aggregate for being established. When the carer responds, the control unit one second, but we got worse results. establishes a telephone connection with the user’s apartment Features were: Spectral centroid, Spectral rolloff point, through the PBX network, enabling the carer to hear what is Spectral flux, Compactness, Spectral variability, Root mean happening in the apartment and act accordingly. square, Fraction of low energy, Zero crossings, Strongest beat, Strength of strongest beat, Strongest frequency via FFT (Fast Fourier transform) maximum, MFCC’s (Mel frequency 3. ACTIVITY RECOGNITION BASED ON SOUND cepstrum coefficients) (13 coefficients), Linear predictive DATA coding (LPC) (10 coefficients), Method of moments (5 People can distinguish quite well between some everyday features), Partial based spectral centroid, Partial based activities just by listening to them. For example, if we hear a spectral flux, Peak based spectral smoothness, Area method spoon hit a plate, we can say that the person is probably of moments (10 features) and Area method of moments of eating; if we hear the sound of pressing keyboard buttons, we MFCCs (10 features). 
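As an illustration of the per-minute feature vectors described above, the sketch below computes a small subset of the 19 feature types (spectral centroid, zero-crossing rate, RMS energy and 13 MFCCs) on 20 ms frames with 20 % overlap and aggregates them with mean and standard deviation. librosa is used in place of the jAudio library from the paper, and the parameter choices are illustrative.

```python
# Per-minute feature vector from 20 ms analysis windows with 20 % overlap.
import numpy as np
import librosa

def minute_features(y, sr):
    frame = int(0.020 * sr)                 # 20 ms analysis window
    hop = int(0.016 * sr)                   # 20 % overlap between windows
    feats = np.vstack([
        librosa.feature.spectral_centroid(y=y, sr=sr, n_fft=frame, hop_length=hop),
        librosa.feature.zero_crossing_rate(y, frame_length=frame, hop_length=hop),
        librosa.feature.rms(y=y, frame_length=frame, hop_length=hop),
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=frame,
                             hop_length=hop, n_mels=40),
    ])                                      # shape: (16 feature rows, n_frames)
    # aggregate the per-window values over the whole one-minute recording
    return np.concatenate([feats.mean(axis=1), feats.std(axis=1)])

# Example: one minute of stand-in noise at 16 kHz gives a 32-dimensional vector.
sr = 16000
audio = np.random.default_rng(0).normal(size=60 * sr).astype(np.float32)
vector = minute_features(audio, sr)
```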
Those features were aggregated using can say that the person is either at work or at home and is mean and standard deviation. We also added 10 Area using a computer. We developed a system that automatically moments of Area method of moments of MFCC’s. This sums detects everyday home activities based on sound. up to 136 features. All features are explained in [4]. Figure 2 presents the process of activity recognition from Since we have a lot of features, we use feature selection sound data. Firstly, we gather data using a recording device, algorithms. such as a microphone. When recording, some privacy Finally, we use supervised machine learning techniques to protection should be taken into account (e.g. we could record build classifiers for our data. short sequences of time so we could not be able to recognize spoken words). We propose recording for 5 minutes in the following way: we record 200ms in every second for 1 minute 4. EVALUATION and we do not record remaining 4 minutes. In this section we present an evaluation of our experiment. We gathered recordings from 3 persons in their everyday living with smart phone's microphone. They labeled data with the following activities: "Sleep", "Exercise", "Work", "Eating", "Home - chores" and "Home - leisure". Data was firstly intended for monitoring chronic patients. 27 We split data into training and test set. We were recording Since we had 3 persons, we trained one best-performing each person for 2 weeks and we used the first week as training classifier (SMO) for each person. We got a confusion matrix set and the second week as test set. for each person and we summed all 3 matrices in one. It can We extracted features using open-source library jAudio be seen in Table 1. We can see that activity "Sleep" is almost [4], [5]. flawless. We can also see that "Home - chores" is usually For feature selection and machine learning we used the misclassified as "Home - leisure", which could be a open-source library Weka [6]. We used the feature selection consequence of similar sounds produced in a person's home algorithm ReliefF implemented in Weka on every person. We during various activities. Due to the high number of instances used 4 machine learning algorithms: SMO, J48, labeled as “Work”, we got very good classification of RandomForest and iBK, all with default parameters. We "Work", but there are also many instances misclassified as measured accuracies of all algorithms and then we used the "Work". We must take into account that recorded persons best-performing algorithm and we measured F-measures for worked in the office, so many sounds are similar as in the all the activities. home environment. We can conclude that for different activities, there can be many similar sounds, e.g. when person reads a book at home ("Home - leisure"), there can be silence 78 as if the person took a nap ("Sleep"), so it is very challenging for classifiers to achieve high accuracies. 76 smo 74 Table 1: Summed confusion matrix of all persons. 72 RandomForest 70 j48 g re rk e – res e - 68 leep ercise o tin m o m classified IBK S x a o W o eisu 66 E E H ch H L as 64 198 0 1 0 0 5 Sleep 62 0 48 40 0 0 0 Exercise Figure 3: Average accuracy of all classifiers. 0 18 129 43 20 95 Work 0 As can be seen in Figure 3, the best performing algorithm 0 2 48 50 7 10 Eating was SMO, which produced the highest accuracies on all the 10 6 92 3 66 44 Home - chores tested persons. The average accuracy of SMO is 76 %. 
In Figure 4 we can see the average F-measure per activity for the best-performing algorithm, SMO. The best-recognized activity is "Sleep", with an average F-measure of 0.96, followed by "Work" with 0.85. SMO detected "Eating" and "Exercise" relatively well, with average F-measures of 0.46 and 0.43, respectively. The remaining average F-measures, for "Home - leisure" and "Home - chores", were 0.38 and 0.26, respectively.

Figure 4: Average F-measure values of SMO.

5. CONCLUSION

This paper presents a system and an approach for human activity recognition based on sound. The approach was tested on real-life recordings of three persons who annotated their activity for 2 weeks.

As outlined in Section 4, activity recognition from sound on 1-minute intervals may be challenging. There may be complete silence during different kinds of activities (e.g. sleep, work), or the recording may be dominated by speech. Therefore, it is difficult to achieve high accuracies in such settings.

Nevertheless, activity recognition from sound may be used for remote elderly care. If we detect that the user was eating at the usual times during the day, even without an exact value for the eating period, we may conclude that the user's state is normal. With reliable sleep recognition, we may detect whether the person is waking up during the night or whether the period of sleep is lengthening, both of which may indicate a health problem. As future work, we need to record everyday living activities of the elderly and test the system's capability to detect events that are critical for determining their health state.

References
[1] Eurostat. Retrieved September 2, 2014, from http://epp.eurostat.ec.europa.eu/tgm/table.do?tab=table&init=1&language=en&pcode=tsdde511&plugin=1
[2] European Commission, Horizon 2020, Health, demographic change and wellbeing. Retrieved September 2, 2014, from http://ec.europa.eu/research/participants/portal/doc/call/h2020/common/1617611-part_8_health_v2.0_en.pdf
[3] M. Ziefle, S. Himmel, W. Wilkowska (2011). When your living space knows what you do: Acceptance of medical home monitoring by different technologies. Lecture Notes in Computer Science, pp. 607-624.
[4] D. McEnnis, I. Fujinaga, C. McKay, P. DePalle (2005). jAudio: A feature extraction library. ISMIR.
[5] D. McEnnis, I. Fujinaga (2006). jAudio: Improvements and additions. ISMIR.
[6] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten (2009). The WEKA data mining software: An update. SIGKDD Explorations, Volume 11, Issue 1.

ARE HUMANS GETTING SMARTER DUE TO AI?
Matjaž Gams
Department of Intelligent Systems, Jožef Stefan Institute
Jamova cesta 39, 1000 Ljubljana, Slovenia
e-mail: matjaz.gams@ijs.si

ABSTRACT
Humans are getting smarter due to the use of tools: in history because of mechanical tools, and in recent decades due to information tools. The hypothesis in this paper goes a step further: that we are getting smarter due to the use of AI.
The thesis is indicated by solutions to three well-known logical paradoxes that have recently been resolved by the author of this paper: the unexpected hanging paradox, the Pinocchio paradox and the blue-eyes paradox. This paper is a slightly shortened version of the Informatica paper on the same topic [19].

1. INTRODUCTION

Systematic measurements show that scores on standard broad-spectrum IQ tests improve each decade; according to the Flynn effect [1], humans are getting smarter and smarter. One theory claims that the increase in human intelligence is related to the use of information tools [2], which often progress exponentially over time [3]. In this paper we go a step further: artificial intelligence (AI) influences human intelligence in a positive way, like the other influencing factors. We illustrate the hypothesis in Figure 1.

Figure 1: Current and predicted growth of computer and human computing capabilities to solve problems.

The y-axis of Figure 1 is on a logarithmic scale; therefore, the linear growth of computer skills on the graph corresponds to the exponential nature of Moore's law [4]. Basic human physical and mental characteristics, such as speed of movement, coordination or speed of human computing, have remained nearly constant in recent decades, as represented by the horizontal line in Figure 1.

Our first thesis is that the ability of humans to solve problems increases due to information tools such as computers, mobile devices with advanced software, and AI in particular (the bold top line in Figure 1). Programs such as the Google browser may provide the greatest knowledge source available to humans, thereby representing an extension of our brains, as calculators do in the field of arithmetic. The stronger and more provocative hypothesis is that humans are getting smarter on their own due to AI comprehensions. After all, AI is about intelligence. In the AI community [5], it is generally accepted that AI progress is increasing and might even enable human civilization to take a quantitative leap [6].

Several opposing theories claim that humans actually perform worse on their own, since machines and tools have replaced humans' need to think for themselves. We argue that while this effect may be valid for human physical properties, related for example to obesity due to lack of physical activity, it is not the case in mental tasks. Another pessimistic viewpoint suggests that intelligent civilizations decline after reaching a certain development level (see Figure 1), possibly because of overpopulation, self-destruction or depletion of natural resources. This would explain why we have not yet detected alien civilizations, though Drake's equation [7] indicates that many such civilizations should exist. This remains an open question.

How can we indicate that AI helps humans think better? If we can show that humans can solve logical puzzles that until recently they were not able to solve without computers, that would be a good indication of humans getting smarter on their own. An objection might be that a single solution of a single puzzle is far too little to show anything. However, an indication it might be - at least enough to start a debate. To demonstrate the idea, we analyze the unexpected hanging paradox [8, 9, 10, 11] and briefly mention a couple of other logical paradoxes.

2. THE LIAR PARADOX

For an introduction to logical paradoxes we quickly examine the liar paradox, first published in [12]. According to [13], it was first formulated by the Greek philosopher Eubulides: "A man says that he is lying. Is what he says true or false?" This sentence is false when it is true. These days, the paradox is usually presented in the form "This sentence is false." Today, however, it is generally accepted that there is no true paradox, since the statement is simply false [14].
The contradiction is of the form "A and not A", or "it is true and false". In other words, if a person always lies by definition, then that person is allowed to say only lies. Therefore, such statements are simply not allowed, which means they are false.

We presented the liar paradox to analyze why humans had trouble with it before and why it is now seen as a trivial case. When faced with the liar paradox for the first time, humans fall into a loop of true/untrue derivations without observing that their thinking was already falsified by the declaration of the problem. It seems a valid logical problem, so humans apply logical reasoning. However, the declaration of the logical paradox was illogical from the start, rendering logical reasoning meaningless.

By analogy, 1 + 1 = 2, and we all accept this as a true sentence without any hesitation. Yet one liter of water and one liter of sugar do not combine to form two liters of sugar water. Therefore, using common logic/arithmetic in such a task is inappropriate from the start.

Which AI methods help us better understand such paradoxes? The principle and paradox of multiple knowledge [15] tentatively explain why humans easily resolve problems such as the liar paradox. We use multiple ways of thinking not only in parallel, but also with several mental processes interacting during problem-solving. Different processes propose different solutions, and the best one is selected. The basic difference of the multiple-knowledge viewpoint compared to the classical ones occurs already at the level of neurons. The classical analogy of a neuron is a simple computing mechanism that produces 0/1 as output. In the multiple-knowledge viewpoint, each neuron outputs 2^N possible outcomes, which can be demonstrated if N outputs from a single neuron are all connected to N inputs of another neuron. In summary, the multiple-knowledge principle claims that the human computing mechanism is already much more complex at the level of a neuron than commonly described, and even more so at the level of higher mental processes. Therefore, humans have no problem computing that one apple and one apple are two apples, while one liter of water and one liter of sugar make 1.6 liters of liquid with a mass of 2.25 kilograms, since they use multiple thinking. It is only that a person who logically encounters the sugar-water merge for the first time may claim that it will result in 2 liters of sugar water. However, after an explanation or an experiment, humans comprehend the problem and have no future problems of this kind.

Another AI solution at hand uses contexts. In arithmetic, 1 + 1 = 2. In merging liquids and solid materials, 1 + 1 ≠ 2. In the first case the context was arithmetic, and in the second case merging liquids and solid materials. Contexts enable an important insight into paradoxes such as the unexpected hanging paradox.

3. THE UNEXPECTED HANGING PARADOX

The unexpected hanging paradox, also known as the hangman paradox, the unexpected exam paradox, the surprise test paradox, or the prediction paradox, yields no consensus on its precise nature, so a final correct solution has not yet been established [9]. It is a paradox about a person's expectations about the timing of a future event that they are told will occur at some unexpected time [16]. The paradox has been described as follows [9]:

A judge tells a condemned prisoner that he will be hanged at noon on one weekday in the following week, but that the execution will be a surprise to the prisoner: he will not know the day of the hanging until the executioner knocks on his cell door at noon that day. Having reflected on his sentence, the prisoner draws the conclusion that he will escape the hanging. His reasoning is in several parts. He begins by concluding that the "surprise hanging" cannot be on Friday, as, if he has not been hanged by Thursday, there is only one day left - and so it will not be a surprise if he is hanged on Friday. Since the judge's sentence stipulated that the hanging would be a surprise to him, he concludes it cannot occur on Friday. He then reasons that the surprise hanging cannot be on Thursday either, because Friday has already been eliminated and, if he has not been hanged by Wednesday night, the hanging must occur on Thursday, making a Thursday hanging not a surprise either. By similar reasoning he concludes that the hanging can also not occur on Wednesday, Tuesday or Monday. Joyfully he retires to his cell, confident that the hanging will not occur at all. The next week, the executioner knocks on the prisoner's door at noon on Wednesday - which, despite all of the above, is an utter surprise to him. Everything the judge said came true.

Evidently, the prisoner miscalculated, but how? Logically, the reasoning seems correct. While there have been many analyses and interpretations of the unexpected hanging paradox, there is no generally accepted solution.
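The prisoner's elimination argument can be written out mechanically. The snippet below is an illustration of that backward induction only - not of the AI-based analysis developed in this paper - and the names in it are ours.

```python
# Illustration of the prisoner's (flawed) backward induction,
# not of the AI-based resolution argued for in this paper.
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
possible = list(days)
while possible:
    # If he has not been hanged before the last remaining day, that day would be
    # certain and hence no surprise -- so the prisoner strikes it out.
    possible.pop()
print(possible)   # []: the prisoner concludes that a surprise hanging is impossible
```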
The paradox is interesting to study because it arouses interest in both laymen and scientists. Here, we provide a different analysis based on the viewpoint of cooperating AI agents [5, 16], contexts and multiple knowledge [15]. It might be the case that similar solutions have been presented before, but it seems that the AI viewpoint disregards potential complications and provides a simple solution.

First we examine which events are repeatable and which are irreversible. The prediction of a hanging on one out of five possible days is well defined through the real-life empirical fact of a human life being irreversibly terminated. However, the surprise is less clearly defined. If it denotes cognitive surprise, then the prisoner can simply be sure, each new day, that the hanging will take place on the current day; no surprise is then assured on any day, even the first, so a hanging under the given conditions is not possible. Such an interpretation makes no sense. To avoid the prisoner being cognitively certain, the following modifications are often proposed [9]:

The prisoner will be hanged next week, and the date (of the hanging) will not be deducible in advance from the assumption that the hanging will occur during the week (A).

The prisoner will be hanged next week and its date will not be deducible in advance using this statement as an axiom (B).

Logicians are able to show that statement (B) is self-contradictory, indicating that in this interpretation the judge uttered a self-contradicting statement leading to a paradox. Chow [10] presents a potential explanation through epistemological formulations, suggesting that the unexpected hanging paradox is a more intricate version of Moore's paradox [9]: a suitable analogy can be reached by reducing the length of the week to just one day. Then the judge's sentence
becomes: "You will be hanged tomorrow, but you do not know that."

Now we can apply AI methods to analyze the paradox. First, the judge's statement is a one-sided contract from an AI agent viewpoint, defining a way of interacting and cooperating. As with any agreement or contract, it also has some mechanisms defining the consequences if one side violates the agreement. Since the judge unilaterally proclaimed the agreement, he can even violate it without any harm to himself, whereas the prisoner's violations are punished according to the judge's will and the corresponding regulations. For example, if the prisoner harms a warden, the deal is probably off, and the hanging can occur at the first opportunity, regardless of whether it is a surprise. This is an introductory indication that the hanging paradox is from the real world and that it matters - it is not just logical thinking. Even more importantly, it enables a valid conclusion that any error in the prisoner's actions releases the judge from his promise.

Since the judge is the interpreter of the agreement, he could accept the weird viewpoint that it suffices for the prisoner to claim a surprise in order to be released. However, the judge is supposed to be a smart person, and there is no sense in such a viewpoint. The judge is also supposed to be an honest person, and as long as the prisoner abides by the appropriate behavior, the judge will keep his word and presumably postpone the execution if the prisoner predicts the exact day of the hanging.

Now we come to the crucial, reasonable definition of the ambiguity, defined by the smart and honest judge. The term deducible now means that the prediction will be 100 percent guaranteed accurate about a one-time event (that is, the hanging), so such a prediction can be uttered only once a week, not each day anew. Therefore, the prisoner has exactly one
chance of not only predicting, but also explaining with certainty to the judge, why the hanging will occur on that particular day. The judge will have to be persuaded; that is, he will have to understand and accept the prisoner's line of reasoning. If not, the deal is off and the judge can choose any day while still keeping his word.

For easier understanding, consider that the prisoner is given a life-saving coupon on which he writes the predicted day and stores it in the judge's safe on Monday morning with the explanation attached. Obviously, the prisoner stands no chance if the judge orders the hanging on Monday. Namely, if the prisoner proposes Monday, he cannot provide a deducible explanation why the hanging will happen on Monday. Yes, he will not be surprised in cognitive terms, but both a correct prediction and a deducible explanation are required in order to avoid the hanging. The only chance to avoid the hanging is to predict Friday and hope that he will not be hanged before Friday. (In this case, the judge could still object that, on Monday for example, the prisoner could not provide a plausible explanation for Friday. Yet that would not be fair since, on Friday, the prisoner would indeed be sure of the judge coming into contradiction.) Even if the prisoner is allowed to deposit the one and only coupon on any day of the week, there is no major difference in terms of the explanation in this paper. Again, if the prisoner is allowed to deposit the coupon each day anew, this formulation makes no sense.

We can further explain the error in the prisoner's line of reasoning by assuming that, instead of giving his ruling five days in advance, the judge gave it on Thursday morning, leaving a two-day opportunity. Since the prisoner could use the single pardon (remember: deducible for a one-time event means one prediction, once) and save himself on Friday, he concludes that Thursday is the only day left and cashes in his only coupon with a 100 percent certain logical explanation on Thursday. However, in this case the judge could carry out the hanging on Friday. Why? Because the prisoner provided the only 100 percent certain prediction in the form of the single life-saving coupon on Thursday, which means that on Friday he could not deliver the coupon. In other words, the prisoner wrongly predicted the hanging day and therefore violated the agreement.

The situation on Thursday is similar to the situation on Monday. Even if the judge knocks on the door on Thursday, and the prisoner correctly predicted Thursday, he still could not provide a 100 percent certain explanation why the hanging would occur on Thursday, since the judge could come back on Friday as described above; therefore, the judge can proceed on Thursday or Friday without violating his proclamation.

4. DISCUSSION

Wikipedia offers the following statement regarding the unexpected hanging paradox [9]:

There has been considerable debate between the logical school, which uses mathematical language, and the epistemological school, which employs concepts such as knowledge, belief and memory, over which formulation is correct.

According to other publications [8], this statement correctly describes the current state of the scientific literature and the human mind.

To some degree, solutions similar to the one presented in this paper have already been published [8, 9]. However, they have not been generally accepted and, in particular, have not been presented through AI means. Namely, AI enables a clearer explanation, such as: the error in the prisoner's line of reasoning occurs when extending his induction from Friday to Thursday, as noted earlier, but the explanation in this paper differs. The correct conclusion about Friday is not:

"Hanging on Friday is not possible" (C), but:

"If not hanged till Friday and the single prediction with explanation was not applied for any other day before, then hanging on Friday is not possible." (D)

The first condition in (D) is part of common knowledge. The second condition in (D) comes from common sense about one-sided agreements: every breach of the agreement can cause its termination. The two conditions reveal why humans have a much harder time understanding the hanging paradox than the liar paradox. The conditions are related to the concepts and interpretation of time and deducibility and should be applied simultaneously, whereas only one insight is needed in the liar paradox. In AI, this phenomenon is well known as context-sensitive reasoning in agents, which was first presented in [18] and has been used extensively in recent years. Here, as in real life, under one context the same line of reasoning can lead to a different conclusion than under another context (remember the sugar water). But one can also treat the conditions in statement (D) as logical conditions, in which case the context can serve for easier understanding.
The same applies to the author of this paper: although he has been familiar with the hanging paradox for decades, the solution at hand emerged only when the insight related to contexts appeared.

Returning to the motivation for the analysis of the unexpected hanging paradox, the example was intended to show that humans have mentally progressed enough to see the trick in the hanging paradox, similar to how people became too smart to be deceived by the liar paradox. This new approach has also been used to solve several other paradoxes, such as the blue-eyes paradox and the Pinocchio paradox. Analyses of these paradoxes are being submitted to other journals.

In summary, the explanation of the hanging paradox and the difficulty it poses for human solvers resemble those of the liar paradox before it was solved beyond doubt. It turns out that both paradoxes are not truly paradoxical; instead, they describe a logical problem in a way that a human using logical methods cannot resolve. Similar to the untrue assumption that a liar can utter a true statement, the unexpected hanging paradox exploits two misconceptions in the prisoner's line of reasoning. The first is that a 100 percent accurate prediction for a single event can be uttered more than once (through a vague definition of "surprise"), and the second is that a conclusion that is valid at one time is also valid during another time span. Due to the simplicity of the AI-based explanation in this paper, there is no need to provide additional logical, epistemological, or philosophical mechanisms to explain the failure of the prisoner's line of reasoning.

This paper provides an AI-based explanation of the hanging paradox for humans in natural language, while formal explanations remain a research challenge. A formal analysis has already been designed for the Pinocchio paradox, whereas the blue-eyes paradox has not yet been formally explained, only in a way similar to this paper.

References
[1] T. W. Teasdale, D. R. Owen (2005). A long-term rise and recent decline in intelligence test performance: The Flynn Effect in reverse. Personality and Individual Differences, Volume 39, Issue 4, September 2005, pp. 837-843, Elsevier.
[2] J. R. Flynn (2009). What Is Intelligence: Beyond the Flynn Effect. Cambridge, UK: Cambridge University Press.
[3] Computing laws revisited (2013). Computer 46/12.
[4] G. E. Moore (1965). Cramming more components onto integrated circuits. Electronics Magazine, 4.
[5] Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI'13) (2013). Beijing, China.
[6] R. Kurzweil (2005). The Singularity is Near. New York: Viking Books.
[7] T. Dean (2009). A review of the Drake equation. Cosmos Magazine.
[8] Wolfram, A. (2014). http://mathworld.wolfram.com/UnexpectedHangingParadox.html
[9] Unexpected hanging paradox, Wikipedia (2014). https://en.wikipedia.org/w/index.php?title=Unexpected_hanging_paradox&oldid=611543144, June 2014.
[10] T. Y. Chow (1998). The surprise examination or unexpected hanging paradox. American Mathematical Monthly 105: 41-51.
[11] E. Sober (1998). To give a surprise exam, use game theory. Synthese 115: 355-73.
[12] D. J. O'Connor (1948). Pragmatic paradoxes. Mind 57: 358-9.
[13] J. C. Beall, M. Glanzberg (2013). In E. N. Zalta (ed.), The Stanford Encyclopedia of Philosophy.
[14] A. N. Prior (1976). Papers in Logic and Ethics. Duckworth.
[15] M. Gams (2001). Weak Intelligence: Through the Principle and Paradox of Multiple Knowledge. New York: Nova Science Publishers, Inc.
[16] R. A. Sorensen (1988). Blindspots. Oxford, UK: Clarendon Press.
[17] H. P. Young (2007). The possible and the impossible in multi-agent learning. Artificial Intelligence 171/7.
[18] R. M. Turner (1993). Context-sensitive reasoning for autonomous agents and cooperative distributed problem solving. In Proceedings of the IJCAI Workshop on Using Knowledge in its Context, Chambéry, France.
[19] M. Gams (2014). The unexpected hanging paradox from an AI viewpoint. Informatica 38, 181-185.
Acknowledgements

The author wishes to thank several members of the Department of Intelligent Systems, particularly Boštjan Kaluža, Mitja Luštrek, and Tone Gradišek, for their valuable remarks. Special thanks are also due to Angelo Montanari, Stephen Muggleton, and Eva Černčic for contributions on this and other logic problems.

DEVELOPING A SENSOR FIRMWARE APPLICATION FOR REAL-LIFE USAGE
Hristijan Gjoreski, Mitja Luštrek, Matjaž Gams
Department of Intelligent Systems, Jožef Stefan Institute, Jožef Stefan International Postgraduate School
e-mail: {hristijan.gjoreski, mitja.lustrek, matjaz.gams}@ijs.si

ABSTRACT
In recent years the demand for intelligent systems that support the life of the elderly is increasing. In order to provide appropriate support, these systems should constantly monitor the user with sensors. However, using sensors in real-life situations is a challenging task, mainly because of the constraints on sensor energy consumption (battery life) and on the memory capacity for storing the sensor data. In this paper we present an example of a sensor communication protocol developed for the Shimmer accelerometers, so that they can be used in real-life situations, i.e., constantly monitoring the user during a normal day. A custom firmware application is developed, which has several functionalities: real-time data streaming through Bluetooth, data logging to the internal microSD card, sending the stored data to a Bluetooth-enabled device, and detecting when the sensor is put on and off a charging dock.

1 INTRODUCTION

The world's population is aging rapidly, threatening to overwhelm society's capacity to take care of its elderly members. The percentage of persons aged 65 or over in developed countries is projected to rise from 7.5% in 2009 to 16% in 2050 [1]. This is driving the development of innovative ambient assisted living (AAL) technologies to help the elderly live independently for longer and with minimal support from the working-age population [2][3]. To provide timely and appropriate assistance, AAL systems must monitor the user by using ambient (environmental) and/or wearable sensors. With the recent development of sensor technology, wearable sensors are gaining attraction and can measure many different user-related parameters: location, activity, physiological parameters, etc. Examples include GPS, accelerometers, gyroscopes, heart-rate sensors, breath-rate sensors, etc.

In order to be used in everyday life situations, these sensors have to be able to constantly monitor the user during the day. However, this is often a challenging task, mostly due to battery-consumption constraints and memory-storage capacity. In this study we present a sensing protocol for the Shimmer accelerometer sensors, so they can be used to constantly monitor the user during the day. A custom TinyOS firmware application was developed in order to satisfy the user's requirements and the sensors' limitations. It implements two modes of operation: real-time data sending, and logging the data in the internal memory and sending it offline in batches.

The motivation and context of the study is the CHIRON project (Cyclic and person-centric health management: integrated approach for home, mobile and clinical environments) [4]. It is a European research project of the ARTEMIS JU Programme with 27 project partners. It includes industry partners (large companies and SMEs), research and academic institutions, and also medical institutions. The project addresses one of today's societal challenges, i.e., "effective and affordable healthcare and wellbeing". CHIRON combines state-of-the-art technologies and innovative solutions into an integrated framework of embedded systems for effective and person-centered health management throughout the complete healthcare cycle, from primary prevention (person still healthy) to secondary prevention (initial symptoms or discomfort) and tertiary prevention (disease diagnosis, treatment and rehabilitation) in various domains: at home, in nomadic environments and in clinical settings.

2 SENSORS

In the CHIRON project, two Shimmer accelerometers are used to monitor the user's activities. The sensor platform is based on the Shimmer Wireless Sensor Network (WSN) module. It is based on a T.I.
MSP430F1611 microcontroller, which operates at a maximum frequency of 8 MHz and is equipped with 10 KB of RAM and 48 KB of flash. Wireless communication is achieved either with Bluetooth v2 (BT - RN-42 module) or through IEEE 802.15.4 (T.I. CC2420 module). In our study we used the standard BT v2 in order to easily connect the sensor with a smartphone. For storage purposes, the Shimmer platform is equipped with an integrated 2 GB microSD card, which is used in the normal operation mode to store sensor readings [5]. The power supply comprises a 450 mAh rechargeable Li-ion battery.

The firmware of the Shimmer platform is based on the open-source TinyOS operating system [6]. It uses the NesC programming language, which is a lightweight version of C. TinyOS/NesC is dedicated to low-power wireless sensors and allows many sensor platforms with a heterogeneous set of hardware devices to be programmed and controlled (microcontroller, sensors, SD cards, etc.).

TinyOS in the Shimmer sensors follows a three-layer abstraction architecture. At the bottom is the Hardware Presentation Layer (HPL), which allows access to input/output pins or registers of the hardware devices. Next, the Hardware Abstraction Layer (HAL) allows configuring more complex functionality in order to communicate with external sensors or resources implemented in the platform. The top layer is the Hardware Independent Layer (HIL), which permits reading the sensor data independently of the digital communication bus. Each layer communicates with the adjacent ones through interfaces, either generic or hardware-specific. As TinyOS is an event-driven operating system, the interfaces call commands that are addressed to the lower layer. These
commands are answered by the lower layers by signaling events. In our case - the Shimmer sensor platform - the HPL and HAL layers are already available, and for its internal resources, such as the accelerometer, the SD card or the Bluetooth radio, the HIL layer is also implemented.

Once the layers are implemented, a firmware application is developed. In our case, the application is based on the specifications (sensing protocol) provided by the doctors in the CHIRON project. The firmware application and the sensing protocol are discussed in the next section.

3 SENSING PROTOCOL AND FIRMWARE APPLICATION

In order to explain the sensing protocol (shown in Figure 1), let us consider the following scenario. The user wakes up and takes the two accelerometers from the charging dock. Once they are taken off the dock, the sensors have to start sensing. The user attaches the sensors to the wearable garment (e.g., chest and thigh elastic straps) and performs an initialization activity sequence. This sequence is performed in order to ensure that the sensors have the right orientation (important for the post-processing of the data). The orientation checking lasts a few minutes, during which the sensor data is streamed in real time to a smartphone application. Once the smartphone confirms that the setup is all right, the user continues with his everyday activities. During this period the sensors log the data locally to a microSD card. At the end of the day, the user takes off the sensors, puts them on the charging docks, and goes to sleep. During the sleep, the sensors are charged and all the data is transferred to the processing unit.

The scenario shows that the battery life of the sensors should last at least 16 hours (the active period of a normal day) and that the sensors should be able to receive commands from a smartphone through Bluetooth. Our tests showed that if the standard firmware application is used (real-time data sending over BT), the battery lasts around 6 hours, which is not sufficient for the whole day. Furthermore, with this approach there would have to be a constant BT connection between the smartphone and the sensors, which is highly unlikely in a real-life scenario. In order to achieve these functionalities we created a custom firmware application which has two modes of operation: real-time data sending, and logging the data on the microSD card and sending it for offline analysis. The application is based on the two standard Shimmer firmware applications, which are publicly available: real-time data acquisition ("BoilerPlate.ihex") and data logging ("JustFATLogging.ihex"). The original logging application (JustFATLogging.ihex) has one main function: to log the acceleration data on the microSD card. The start of logging is triggered when the sensor is removed from the docking station, and the end of logging is triggered when the sensor is put back on the docking station. The data can be accessed only through the USB port of the docking station. We used this firmware application as a base for further development.

First, we added the Bluetooth functionality in order to allow wireless communication between the sensor and the smartphone. However, the activation of the Bluetooth significantly decreased the sensor's battery life. Therefore, we modified the application so that the Bluetooth is activated only when the sensor is put back on the charger. During the charging time, the smartphone sends a command and collects the logged data. When the user decides to mount the sensors, he/she gives a command to start logging and to turn off the Bluetooth. Thus, during the logging process the Bluetooth is off and there is no communication between the smartphone and the sensor.

For the acceleration data it is really important how the sensor is mounted, i.e., the sensor orientation must be the same for every recording. In order to check the orientation of the sensors, an algorithm analyzes the data during some predefined activities, e.g., standing and lying, and accordingly gives a notification to the user if the sensors are not mounted in the correct way. To allow this analysis, the data has to be processed in real time; therefore we added this functionality, i.e., real-time transmission. The real-time data acquisition is performed before the start logging command is sent.

In addition, two more functionalities were implemented: deleting a log file (delete log) and checking the availability of a log file (is log available).

The final modification is related to the timestamps of the data samples and data synchronization between different sensors. In order to synchronize the data between the sensors, one must know the absolute timestamp of the data samples. In our case, we used the timestamp of the start of the logging and the time difference between consecutive data samples.
The sensor's internal crystal clock is used for estimating the time difference between consecutive data samples. Thus, each data sample is labeled with a timestamp provided by the clock. The starting timestamp is sent by the smartphone with each start logging command. Using the starting timestamp and the internal counter's timestamps, the smartphone is able to reconstruct an absolute timestamp for each data sample.

Figure 1. Sensor Firmware Flowchart.

In theory, the tolerance of the internal Shimmer crystal clock (Epson FC-135 32.7680KA-A3) is ±20 ppm, which results in a maximal drift of 1.8 seconds in 24 hours. In the worst-case scenario, when two sensors drift in different directions (+ or -), the time difference is 3.6 seconds, which is acceptable for the project's requirements. Several practical tests were performed and confirmed the theoretical analysis, i.e., the measured drift was in the range of 1 second for a whole-day recording.
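The drift figures above follow directly from the clock specification, and the timestamp reconstruction is simple arithmetic. The sketch below is an illustrative check in plain Python, not the NesC firmware code; the assumption that the counter ticks at the nominal 32768 Hz rate of the 32.768 kHz crystal named above is ours.

```python
# Illustrative check of the drift figures and the timestamp reconstruction
# (plain Python, not the actual NesC firmware).
TOLERANCE_PPM = 20                    # Epson FC-135 crystal tolerance quoted above
SECONDS_PER_DAY = 24 * 3600

max_drift_one_sensor = TOLERANCE_PPM * 1e-6 * SECONDS_PER_DAY  # ~1.7 s, i.e. the ~1.8 s above
max_drift_two_sensors = 2 * max_drift_one_sensor                # ~3.5 s worst case

TICK_HZ = 32768                       # assumed nominal counter rate of the 32.768 kHz crystal

def absolute_timestamps(start_unix_time, tick_counts):
    """Absolute sample times = start timestamp sent by the smartphone
    plus the sensor's internal counter converted to seconds."""
    return [start_unix_time + ticks / TICK_HZ for ticks in tick_counts]
```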
4 SENSOR COMMANDS AND EVENTS

Table 1 describes the commands that can be received by the sensor application firmware. These commands can be sent by any BT-equipped device.

Table 1. Sensor commands.
Real-time transmission - the sensor samples accelerometer data at 50 Hz and sends each sample to the smartphone.
Start logging - the sensor stops the real-time sending and waits for the absolute timestamp from the smartphone. This timestamp is written at the beginning of each log file and is used as a reference point for reconstructing the timestamp of each data sample. After that, the sensor starts logging the accelerometer data. Additionally, the Bluetooth is turned off; there is no communication between the sensor and the smartphone during the logging process.
Is log available - the sensor checks whether a log file is available for sending and sends the availability status.
Send log - the sensor sends the logged file. First the absolute timestamp is sent, and then the accelerometer data samples.
Delete log - the sensor deletes the log file.

Table 2 describes the events that are detected by the application firmware.

Table 2. Events that can be detected by the sensor's application firmware.
Docking event (the sensor is put on the charging dock) - the sensor stops the logging and turns on the Bluetooth. The yellow LED is turned on, indicating that the log file is ready to be sent.
Sensor manual restart - the sensor restarts to the initial state; that is, it stops the logging and turns on the Bluetooth. The yellow LED is turned on, indicating that the log file is ready to be sent.

5 THEORETICAL AND EMPIRICAL TESTS

After developing the firmware application we performed several theoretical and empirical tests. First, we analyzed the amount of data expected to be generated on a daily basis. The MSP430 A/D channels perform 12-bit digitization (2 bytes of storage), and a 16-bit (2-byte) timestamp is stored for each sample. Table 3 summarizes our projections based on the sampling frequency of each sensor. Based on these calculations, the total amount of data for 12-16 h of daily use should be 114 Mb - 152.4 Mb. This amount of data does not pose any issue in any operational mode, since in the real-time scenario BT can achieve data rates of up to 300 kbps, which is more than adequate for the amount of data generated per second, and in the logging mode the microSD cards on the modules have more than enough capacity to store the generated data.

Table 3. The amount of data generated by the accelerometer.
Sampling frequency (Hz): 50 | Data per second (KB/s): 0.78 | Data per hour (MB/h): 2.74

The energy-consumption analysis of the Shimmer platform, presented by Burns et al. [5], indicates that the accelerometer draws 1.6 mA when sampled at 50 Hz. When the sensor streams accelerometer data in real time through BT, the consumption increases to 5.2 mA. From this analysis it is safe to assume that, since the same hardware equipped with a 450 mAh battery is used, the clinical requirement of 6-8 h of data logging (in the storing mode) or an adequate amount of time (around 1 h) of live streaming (streaming mode) can easily be met, provided that the module's batteries are fully charged at the beginning of sensing. Table 4 lists the average battery lifetime (full battery-drainage period) obtained for the two modes of operation from a series of experiments.

Table 4. Average working time for the real-time mode (Bluetooth is active) and the logging mode (Bluetooth is not active; the sensor logs to the SD card).
Sampling frequency (Hz): 50 | Real-time mode: 6 h 30 m | SD-logging mode: 14 days

6 CONCLUSION

In this paper we showed how one can overcome the sensor limitations (battery life and memory storage) by creating a custom firmware application and adjusting it to real-life situations. We presented a sensing protocol and a sensor firmware application developed for the Shimmer accelerometers. The protocol was created so that the sensors can be used in real-life situations, i.e., constantly monitoring the user during a normal day. The developed custom firmware application has several functionalities: real-time data streaming through Bluetooth, data logging to the internal microSD card, sending the stored data to a Bluetooth-enabled device, and detecting when the sensor is put on and off a charging dock.

Acknowledgement

This work was partly supported by the Slovene Human Resources Development and Scholarship funds and partly by the CHIRON project - ARTEMIS Joint Undertaking, under grant agreement No. 2009-1-100228.

References
[1] United Nations (2009). World population ageing. Report.
[2] A. Bourouis, M. Feham, A. Bouchachia. "A new architecture of a ubiquitous health monitoring system: a prototype of cloud mobile health monitoring system," The Computing Research Repository, 2012.
[3] M. Luštrek, B. Kaluža, B. Cvetković, E. Dovgan, H. Gjoreski, V. Mirchevska, M. Gams. "Confidence: ubiquitous care system to support independent living," DEMO at European Conference on Artificial Intelligence, pp. 1013-1014, 2012.
[4] The CHIRON project: http://www.chiron-project.eu/
[5] A. Burns, B. R. Greene, M. J. McGrath, T. J. O'Shea, B. Kuris, S. M. Ayer, F. Stroiescu, V. Cionca. "SHIMMER™ - A wireless sensor platform for noninvasive biomedical research," IEEE Sens. J., vol. 10, no. 9, pp. 1527-1534, 2010.
[6] P. Levis, S. Madden, J. Polastre, R. Szewczyk, K. Whitehouse, A. Woo, D. Gay, J. Hill, M. Welsh, E. Brewer, D. Culler. TinyOS: An operating system for sensor networks. In W. Weber, J. M. Rabaey, and E. Aarts, editors, Ambient Intelligence, chapter 7, pp. 115-148, 2005.
AUTOMATIC RECOGNITION OF EMOTIONS FROM SPEECH
Martin Gjoreski 1, Hristijan Gjoreski 2, Andrea Kulakov 1
1 Faculty of Computer Science and Engineering, Rugjer Boshkovikj 16, 1000 Skopje, Macedonia
2 Department of Intelligent Systems, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia
e-mail: martin.gjoreski@gmail.com, hristijan.gjoreski@ijs.si, andrea.kulakov@finki.ukim.mk

ABSTRACT
This paper presents an approach to the recognition of human emotions from speech. Seven emotions are recognized: anger, fear, sadness, happiness, boredom, disgust and neutral. The approach is applied to a speech database consisting of simulated and annotated utterances. First, numerical features are extracted from the sound database using an audio feature extractor. Next, the extracted features are standardized. Then, feature-selection methods are used to select the most relevant features. Finally, a classification model is trained to recognize the emotions. Three classification algorithms are tested, with SVM yielding the highest accuracies of 89% and 82% using the 10-fold cross-validation and Leave-One-Speaker-Out techniques, respectively. "Sadness" is the emotion recognized with the highest accuracy.

1 INTRODUCTION

Human capabilities for perception, adaptation and learning about the surroundings are often the three main components of the definition of intelligent behavior.
In the last few decades, many studies have suggested that one very important component is left out of this definition of intelligent behavior: emotional intelligence. Emotional intelligence is the ability to feel, express and regulate one's own emotional state, and to recognize and handle the emotional state of others. In psychology, the emotional state is defined as a complex state that results in psychological and physiological changes that influence our behavior and thinking [1].

With the recent advancements of technology and the growing research areas such as machine learning, audio processing and speech processing, emotional states will be an inevitable part of human-computer interaction. More and more studies are working on providing computers with abilities such as recognizing, interpreting and simulating emotional states.

In this research we present an approach for automatic recognition of emotions from speech. The goal is to recognize the emotional state that the speaker is experiencing. Furthermore, the focus is on how something is said, and not on what is said. Besides this approach, where only the speaker's voice is analyzed, there are several different approaches for recognizing the emotional state. In some approaches the voice and the spoken words are analyzed [2]. Some are focused only on facial expressions [3]. Some analyze the reactions in the human brain for different emotional states [4]. There are also combined approaches, where a combination of the mentioned approaches is used [5].

In studies where human emotions are analyzed, mainly two methodologies are used. In the first methodology, the emotions are viewed as discrete and completely distinct classes that are universally recognized [6]. In the second methodology, the emotional states are represented in a 2D or 3D space, where parameters such as emotional distance, level of activeness, level of dominance and level of pleasure can be observed [7]. In this research the discrete methodology is used, so the emotional states are represented as 7 classes: anger, fear, sadness, happiness, boredom, disgust and neutral.

The remainder of this paper is organized as follows. The next section is a brief overview of speech emotion analysis. Then, the methodology used for the process of emotion classification is presented. In the next section, the experimental setup and the results are presented. Finally, the conclusion and a brief discussion of the results are given.

2 SPEECH EMOTION ANALYSIS

Speech emotion analysis refers to the use of methods to extract vocal cues from speech as a marker for emotional state, mood or stress. The main assumption is that there are objectively measurable cues that can be used for predicting the emotional state of the speaker. This assumption is quite reasonable, since emotional states arouse physiological reactions that affect the process of speech production. For example, the emotional state of fear usually initiates rapid heartbeat, rapid breathing, sweating and muscle tension. As a result of these physiological activities there are changes in the vibration of the vocal folds and the shape of the vocal tract. All of this affects the vocal characteristics of the speech, which allows the listener to recognize the emotional state that the speaker is experiencing [8]. The basic speech audio features used for speech emotion recognition are: fundamental frequency (the human perception of fundamental frequency is pitch), power, intensity (the human perception of intensity is loudness), duration features (e.g. rate of speaking) and vocal perturbations. The main question is: are there any objective voice feature profiles that can be used for speaker emotion recognition?
The speech audio features for describing these emotional states, utterances that were rated with more than 60% naturalness the feature profiles will be quite similar so distinguishing and from which the expressed emotion was recognized with them is hard. more than 80%, were included in the final database. In the last few years, new method is introduced where static feature vectors are obtained by using so called acoustic 3.2 Feature Preparation Low-Level Descriptors (LLDs) and descriptive statistical The feature extractor tool used in this research is functionals [10]. By using this approach a big number of openSMILE (Open Speech and Music Interpretation by large feature vectors is obtained. The downside is that not Large Space Extraction) [13]. It is a tool for signal all of the feature vectors are of good value, especially not processing and machine learning. We extracted 1582 for emotion recognition. For that reason a feature selection features in total [14]. The LLDs that openSMILE is using method is often used. are computed from basic features (pitch, loudness, voice quality) or representations of the audio signal (cepstrum, 3 THE APPROACH linear predictive coding). Figure 1 shows the whole process of the speech emotion On these LLDs functionals are applied and static feature classification used in this research. An emotional speech vectors are produced, therefore static classifiers can be database is used, which consists of simulated and annotated used. The functionals that are applied are: extremes utterances. Next, feature extraction is performed by using (position of mix/min value), statistical moments (first to open source feature extractor. Then, the extracted features forth), percentiles (ex. the first quartile), duration (ex. are standardized. After standardization, feature selection percentage of time the signal is above threshold) and methods are used for decreasing the number of features and regression (ex. the offset of a linear approximation of the selecting only the most relevant ones. Finally, the emotion contour). recognition is performed by a classification model. After the feature extraction the feature vectors are standardized so the distribution of the values of the feature Feature Preparation vectors is with mean equal to 0 and standard deviation equal Emotional Feature Feature Feature to 1. Next, a method for feature selection is used. Features Database Extraction Standardization Selection are ranked with algorithms for feature ranking and experiments are done with varying number of top ranked Emotion Emotion features. For ranking the features two different algorithms Classification are used, gain ratio [15] and ReliefF [16]. Both algorithms Figure 1: Scheme for speech emotion classification. are used as they are implemented in Orange software packet for machine learning and data mining [17]. 3.1 Emotional Database There are several emotional speech databases that are 3.3 Emotion Classification extensively used in the literature [11]: German, English, Japanese, Spanish, Chinese, Russian, Dutch etc. One of the Once the features are extracted, selected and standardized, main characteristics of an emotional speech database is the they are used to form the feature vector database. That is a type of the speech: whether it is simulated or it is extracted database in which each data sample is an instance, i.e., from real life situations. The advantage of having a feature vector. 
3.2 Feature Preparation

The feature-extractor tool used in this research is openSMILE (Open Speech and Music Interpretation by Large Space Extraction) [13]. It is a tool for signal processing and machine learning. We extracted 1582 features in total [14]. The LLDs that openSMILE uses are computed from basic features (pitch, loudness, voice quality) or from representations of the audio signal (cepstrum, linear predictive coding). Functionals are applied to these LLDs and static feature vectors are produced, so static classifiers can be used. The functionals applied are: extremes (position of the max/min value), statistical moments (first to fourth), percentiles (e.g. the first quartile), duration (e.g. the percentage of time the signal is above a threshold) and regression (e.g. the offset of a linear approximation of the contour).

After the feature extraction, the feature vectors are standardized so that the distribution of the values of each feature has a mean equal to 0 and a standard deviation equal to 1. Next, a method for feature selection is used. Features are ranked with feature-ranking algorithms, and experiments are done with a varying number of top-ranked features. For ranking the features, two different algorithms are used: gain ratio [15] and ReliefF [16]. Both algorithms are used as implemented in the Orange software package for machine learning and data mining [17].

3.3 Emotion Classification

Once the features are extracted, selected and standardized, they are used to form the feature-vector database. That is a database in which each data sample is an instance, i.e., a feature vector. Additionally, each instance is labeled with the emotion. After this, the instances are used to train a classification model in order to recognize emotions from speech data.
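A rough scikit-learn analogue of this pipeline is sketched below for illustration. The authors used the Orange toolkit with gain-ratio and ReliefF rankers; neither ranker is available in scikit-learn, so a univariate ANOVA ranking stands in for the feature-ranking step, and `X` and `y` are assumed to hold the 1582-dimensional openSMILE vectors and their emotion labels.

```python
# Illustrative scikit-learn analogue (the paper uses Orange with gain ratio / ReliefF;
# the ANOVA ranking below is a stand-in, not the authors' ranker).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

def build_model(n_top_features=500):
    return make_pipeline(
        StandardScaler(),                          # zero mean, unit variance per feature
        SelectKBest(f_classif, k=n_top_features),  # keep only the top-ranked features
        SVC(),                                     # SVM classifier
    )

# X: (n_utterances, 1582) openSMILE feature matrix, y: emotion labels
# model = build_model(500).fit(X_train, y_train)
# accuracy = model.score(X_test, y_test)
```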
The varying color represents the accuracy is 87%, which is significantly high performance number of top ranked features (by ReliefF) used. The highest with such a low number of features. average accuracy of 82% is obtained by using top ranked 1000 features. Also we can see that the accuracy depends Accuracy in % mainly from the speaker that is used as test data. 95 87 86 86 89 88 88 For the experiments about the accuracy per class for each of 85 the 7 emotional states, top ranked 1000 features (by ReliefF) 75 76 83 69 are used. The results are shown in Figure 6. The highest 65 50 100 200 300 400 500 750 1000 1582 accuracy per class of 94% was achieved for the class “sadness” and the lowest accuracy per class of 70% was Number of features achieved for the class “fear”. Accuracy in % Number of features: 300 500 750 1000 1582 100 90 80 70 60 S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 AVG Figure 5: SVM classification accuracy for LOSO with varying number of features 40 Acc A uracy y per cl ass v s al v ue in % [4] R. Horlings, D. Datcu, L. J. M. Rothkrantz. Emotion 100 recognition using brain activity. Proceeding 94 CompSysTech '08 Proceedings of the 9th International 90 85 85 82 Conference on Computer Systems and Technologies and 80 81 78 80 Workshop for PhD Students in Computing, 2008. 70 70 [5] A. Metallinou, S. Lee, S. Narayanan. Audio-Visual Emotion Recognition Using Gaussian Mixture Models 60 Anger Boredom Disgust Fear Happiness Neutral Sadness Average for Face and Voice. Multimedia. 2008. ISM 2008. IEEE Figure 6: SVM accuracy per class value for Leave-One- International Symposium on Multimedia, 2008. Speaker-Out cross-validation with top ranked 1000 [6] P. Ekman. Emotions in the Human Faces. 1982. features. [7] James A. Russell. A circumplex model of affect. 1980. [8] P. N. Juslin, K. R. Scherer. Vocal expression of affect. 5 CONCLUSION In J. A. Harrigan, R. Rosenthal, & K. R. Scherer (Eds.). The new handbook of methods in nonverbal behavior The results showed that SVM outperforms the KNN and research, pp. 65-135, 2004. Naïve Bayes. By using the top ranked 500 features by gain [9] K. R. Scherer. Vocal communication of emotion: A ratio, SVM achieved the highest accuracy of 91%. review of research paradigms. Speech Communication In addition, the 10 fold cross-validation of the SVM showed 40: 227–256. 2003 that highest accuracy of 89% was achieved by using the top [10] M. E. Mena. Emotion Recognition From Speech 750 ranked features. By using the top 300 ranked features Signals, 2012. the accuracy was 87%. This is the so-called “knee” on the [11] D. Ververidis, C. Kotropoulos. A review of emotional graph, which represents the best tradeoff between the speech databases. In: PCI 2003. 9th Panhellenic number of features and the achieved performance. Conference on Informatics., pp. 560–574, 2003. Regarding the accuracy for each of the 7 emotions, [12] F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, experiments were performed with the top ranked 750 B. Weiss. A Database of German Emotional Speech. features by gain ratio. The best recognized emotion was the 2005. In: Proc. Interspeech. pp. 1517–1520. “sadness”, with 97%; and the worst recognized emotion was [13] F. Eyben, M. Wöllmer, B. Schuller. openSMILE - The the “happiness” with 68% accuracy. Munich Versatile and Fast Open-Source Audio Feature With LOSO cross-validation, the SVM achieved highest Extractor. 2010. accuracy of 82% by using the top 1000 ranked features. By [14] F. Eyben, F. Weninger, M. Wollmer, Bjorn Schuller. 
5 CONCLUSION
The results showed that SVM outperforms KNN and Naïve Bayes. By using the top-ranked 500 features by gain ratio, SVM achieved the highest accuracy of 91%.
In addition, the 10-fold cross-validation of the SVM showed that the highest accuracy of 89% was achieved by using the top 750 ranked features. By using the top 300 ranked features the accuracy was 87%. This is the so-called "knee" of the graph, which represents the best tradeoff between the number of features and the achieved performance. Regarding the accuracy for each of the 7 emotions, experiments were performed with the top-ranked 750 features by gain ratio. The best recognized emotion was "sadness", with 97%, and the worst recognized emotion was "happiness", with 68% accuracy.
With LOSO cross-validation, the SVM achieved the highest accuracy of 82% by using the top 1000 ranked features. By using the top 500 ranked features the accuracy was 80%. Regarding the accuracy per emotion, experiments were performed with the top-ranked 1000 features. The highest accuracy per class (emotion), 94%, was achieved for the class "sadness" and the lowest, 70%, for the class "fear".
The results showed that the classifier achieves better accuracy with the 10-fold cross-validation technique than with the LOSO validation technique. The reason for this is that with 10-fold cross-validation the training and the testing data usually contain data samples of the same speaker. This is not the case if the system is intended to be used in real life for users not known in advance. However, a hybrid approach that includes a calibration phase at the beginning (for example, asking the user to record several data samples) is considered for future work.

References
[1] D. G. Myers. Theories of Emotion. Psychology: Seventh Edition. New York, NY: Worth Publishers, 2004.
[2] V. Perez-Rosas, R. Mihalcea. Sentiment Analysis of Online Spoken Reviews. Interspeech, 2013.
[3] A. Halder, A. Konar, R. Mandal, A. Chakraborty. General and Interval Type-2 Fuzzy Face-Space Approach to Emotion Recognition. IEEE Transactions on Systems, Man, and Cybernetics, 43(3), 2013.
[4] R. Horlings, D. Datcu, L. J. M. Rothkrantz. Emotion recognition using brain activity. Proceedings of CompSysTech '08, the 9th International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing, 2008.
[5] A. Metallinou, S. Lee, S. Narayanan. Audio-Visual Emotion Recognition Using Gaussian Mixture Models for Face and Voice. ISM 2008, IEEE International Symposium on Multimedia, 2008.
[6] P. Ekman. Emotions in the Human Face. 1982.
[7] J. A. Russell. A circumplex model of affect. 1980.
[8] P. N. Juslin, K. R. Scherer. Vocal expression of affect. In J. A. Harrigan, R. Rosenthal, K. R. Scherer (Eds.), The New Handbook of Methods in Nonverbal Behavior Research, pp. 65-135, 2004.
[9] K. R. Scherer. Vocal communication of emotion: A review of research paradigms. Speech Communication 40: 227–256, 2003.
[10] M. E. Mena. Emotion Recognition From Speech Signals, 2012.
[11] D. Ververidis, C. Kotropoulos. A review of emotional speech databases. In: PCI 2003, 9th Panhellenic Conference on Informatics, pp. 560–574, 2003.
[12] F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, B. Weiss. A Database of German Emotional Speech. In: Proc. Interspeech, pp. 1517–1520, 2005.
[13] F. Eyben, M. Wöllmer, B. Schuller. openSMILE - The Munich Versatile and Fast Open-Source Audio Feature Extractor. 2010.
[14] F. Eyben, F. Weninger, M. Wöllmer, B. Schuller. openSMILE Documentation. Version 2.0.0, 2013.
[15] H. Deng, G. Runger, E. Tuv. Bias of importance measures for multi-valued attributes and solutions. Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN 2011), 2011.
[16] I. Kononenko, E. Šimec, M. Robnik-Šikonja. Overcoming the myopia of inductive learning algorithms with RELIEFF. Applied Intelligence, forthcoming.
[17] J. Demšar, B. Zupan. Orange: From Experimental Machine Learning to Interactive Data Mining. White paper (http://www.ailab.si/orange). Faculty of Computer and Information Science, University of Ljubljana.
[18] D. Aha, D. Kibler. Instance-based learning algorithms. Machine Learning 6: 37-66, 1991.
[19] N. Cristianini, J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.
[20] S. Russell, P. Norvig. Artificial Intelligence: A Modern Approach. Second Edition, Prentice Hall.

QUALCOMM TRICORDER XPRIZE FINAL ROUND: A REVIEW
Anton Gradišek, Maja Somrak, Mitja Luštrek, Matjaž Gams
Department of Intelligent Systems and Solid State Physics Department
Jožef Stefan Institute
Jamova cesta 39, 1000 Ljubljana, Slovenia
Tel: +386 1 4773967
e-mail: anton.gradisek@ijs.si

ABSTRACT
conditions, independent of professional health care personnel. Being able to diagnose common medical conditions at home benefits both the users, by directing them to see the doctor if needed, and the healthcare system itself, by reducing the costs and waiting times at medical centers.
The Qualcomm Tricorder XPRIZE competition began in January 2012, with the goal of developing a mobile device to monitor health parameters and quickly diagnose several common medical conditions.
In August To be precise, there are two criteria in the competition: i) to 2014, a list of ten finalists was announced, including a continuously monitor key health metrics (blood pressure, Slovenian team MESI Simplifying diagnostics that respiratory rate, heart rate, temperature and the oxygen brings together companies MESI, D· Labs, and saturation – SpO2), and ii) to diagnose a set of 13 pre- Gigodesign, and partners from academia, Jožef Stefan selected (core set) health conditions (Anemia, Atrial Institute and Faculties of Electrotechnics and Medicine Fibrillation, Chronic Obstructive Pulmonary Disease of the University of Ljubljana. In this review, we (COPD), Diabetes, Hepatitis A, Leukocytosis, Pneumonia, present the XPRIZE competition, we briefly look at the Otitis Media, Sleep Apnea, Stroke, Tuberculosis, Urinary ten finalists and more closely at the MESI Simplifying Tract Infection, Absence of condition) and three other diagnostics approach. Special attention is given to the conditions from an additional set (Airborne Allergens, diagnostic algorithm that was developed in order to Cholesterol Screen, Food-borne Illness, HIV, Hypertension, facilitate the diagnostic process. Hypo- and Hyperthyroidism, Melanoma, Mononucleosis, 1 INTRODUCTION Osteoporosis, Pertussis, Shingles, Strep Throat). Furthermore, the consumer experience represented an XPRIZE, formerly known as the X Prize Foundation, is a important component of the qualifying round evaluation non-profit organization that was established in order to criteria. stimulate innovation for the benefit of humanity through incentivized competition. The challenges are “audacious, Around 300 teams from all over the world entered the but achievable, tied to objective, measurable goals” [1]. The competition, with 34 teams reaching the qualifying round. In first prize from the foundation was the Ansari XPRIZE that August 2014, ten teams were chosen for the final round of offered a US$10 million prize for the first non-government the competition that will include testing the products on real organization to launch a reusable manned spacecraft into patients during the summer of 2015. The winners of the space twice within two weeks. The prize was won by an competition will be announced in January 2016. aerospace company Scaled Composites with their In this review paper, we present the ten finalists and their SpaceShipOne [2]. In the following years, other XPRIZEs approaches, based on the information made public so far. were announced, such as Google Lunar XPRIZE that Some teams unveiled several details about their products focuses on launching and landing a robotic spacecraft on while others are more secretive. We pay special attention to the Moon, with sending data back to Earth. the MESI Simplifying diagnostics team approach and the diagnostic algorithm. The Qualcomm Tricorder XPRIZE [3] was launched in January 2012. The name was inspired by the science-fiction TV series Star Trek, where “tricorder” was a device that 2 FINALIST TEAMS immediately diagnosed medical conditions of the patients. Among ten finalists, there are four teams from the United The sponsor of the prize is Qualcomm, an American States, two from the United Kingdom, and one from each semiconductor company that focuses on wireless Taiwan, Canada, India, and Slovenia. The teams are telecommunications technologies. 
The aim of the presented as on the XPRIZE website, except for MESI competition is to revolutionize the healthcare system by Simplifying diagnostics that is presented separately later on. developing an instrument capable of measuring some key health parameters and diagnosing a set of common medical 42 Aezon [4] is a team of student engineers from Johns Final Frontier Medical Devices [9] is a team from Paoli, Hopkins University in Baltimore, Maryland (US), with Pennsylvania (US), connected to Basil Leaf Technologies. several partners from the industry. Their solution consists of They are developing a device called DxtER, which relies on four components, each being developed by or in partnership algorithms developed by medical experience as well as on with a specialized company. The vital signs monitoring unit actual patient charts. Concept art for the product indicates is designed to wrap around the neck, like a neck support the device is roughly spherically shaped with integrated pillow, and is being designed by a startup company Aegle. sensors. The diagnostic module is exploiting microfluidic chip technology and qPCR to test for the presence of pathogens Scanadu [10] is a team from Moffett Field, California (US). and is being developed in partnership with Biomeme. The The team’s product is called Scanadu Scout, which is a disk- data is processed by a smartphone app that also uses shaped device that contains sensors for temperature, hearth algorithms to direct users towards relevant tests. In addition, rate, and blood pressure. The disk is to be held between the the phone uses software for spirometry, developed by thumb and index finger and placed on the forehead. The data SpiroSmart. The data is stored on a cloud where an API uses is transferred to a smartphone and processed there. No big data to help turn user reported symptoms into diagnostic technical specifications are known yet, neither is the solutions. The team also participated in an Indiegogo approach for the diagnostic module. Scanadu ran an campaign where they raised around 5000 US$. Indiegogo campaign from May to July 2013 and managed to raise over US$ 1.6 million. The campaign has also received CloudDX [5] is a Canadian team, associated with the considerable media coverage. company Biosign Technologies, a manufacturer of medical devices. The vital signs unit is placed around the neck; it SCANurse [11] is a team from London, UK. Their system uses two electrodes at upper chest area to monitor ECG and consists of blood, vitals, breath, and image units. No specific an ear bud with an infrared temperature sensor to measure information was provided on their website at the time of body temperature. An ear clip uses photoplethysmograph to writing. monitor breathing and heart rate. Blood pressure is measured by a wrist monitor with the pulse transit time approach. The Zensor [12] is a team from Belfast, UK, connected to diagnostic module is designed to analyze saliva, blood, and Intelesens Responsive Healthcare, a company working on urine. The team is working with industrial partners to non-invasive vital signs monitoring. Their prototype can consolidate multiple tests onto one multi-strip cassette. In detect 3-lead ECG, respiration rate, temperature, and addition, an application was developed to accept data from motion. SpO2 sensor is being developed. To diagnose fitness devices and to integrate them into the system. 
medical conditions, urine and blood analysis is included, Danvantri [6] is a team from Chennai, India, associated although the details have not been made public yet. with American Megatrends. The main component of their product is a handheld health monitor that features a 3-Lead 3 MESI SIMPLIFYING DIAGNOSTICS ECG electrode to measure ECG signals from the finger, pulse oximeter, an infrared temperature sensor, camera, 3- MESI Simplifying diagnostics [13] is a team from axis accelerometer for monitoring physical activity and a Ljubljana, Slovenia. The team consists of partners from the glucometer strip attachment node. Additional devices industry and the academia. The team is led by MESI, a start- include a wireless spirometer, neckband ECG/EEG meter, up company that specializes in development of medical otoscope, and urine sample analyzer. The data is processed devices. Their flagship product is an ankle-brachial index and visualized either on a smartphone or on a tablet. measuring device (ABPI MD) for the detection of peripheral arterial disease. Company D·Labs is responsible for a DMI [7] is a team from Boston-Cambridge, Massachusetts mobile app and API while Gigodesign focuses on improving (US), connected to the DNA Medicine Institute. They the user experience and industrial design. Partners from developed the rHEALTH Sensor which is a device that academia come from Jožef Stefan Institute (Department of employs fluorescence detection optics, microfluidics, and Intelligent Systems) and two faculties of University of nanostrip reagents to perform a suite of hematology, Ljubljana, Faculty of Electrotechnics and Faculty of chemistry, and biomarker assays from blood. The device Medicine. Academic partners are responsible for the was developed in collaboration with NASA to monitor development of algorithms and for expert medical astronaut health. knowledge. The system consists of several modules [14]. A bracelet monitors activity and three vital signs, ECG, SpO2, Dynamical Biomarkers Group [8] is a team from Taiwan. and temperature. A “shield” module is placed on the upper Their system consists of five components: Smart Vital- arm and consists of a wireless cuff for blood pressure Sense-Patch and Smart Vital-Sense-Wrist module; Smart measurements. It also contains a patch located on the chest Blood Sense module; Smart Scope module; Smart Exhaler to measure SpO2, temperature, ECG, respiratory rate, and module; and Smart Urine Sense module. The modules are activity tracking. Data obtained from the bracelet and the connected to a smartphone app that runs algorithms based on shield module fulfill the vital signs monitoring requirement proprietary algorithms to conduct a diagnosis. of the competition. 43 The diagnosis of medical conditions is performed with the 4 DIAGNOSTIC ALGORITHM help of a smartphone app and aims to recognize all The diagnostic algorithm was developed at the Department conditions from the core set, together with Hypertension, of Intelligent Systems of Jožef Stefan Institute by Maja Melanoma, and Strep Throat. The user, that has already Somrak, Mitja Luštrek, Matjaž Gams, and the author of this performed the vital signs measurements, indicates his review [15]. The aim of the algorithm is to predict the concern: “I feel pain” or “I feel unwell”. If “pain” is chosen, medical condition of the patient, based on the symptoms that the user specifies the type of the pain on a schematic human he or she experiences. Around 60 different symptoms are figure (such as “chest pain”). 
Based on vital signs data and taken into account. The problem is highly non-trivial. There the type of pain/feeling unwell, the algorithm generates a list is no simple function that would map the domain of a group of possible symptoms that the user may experience. This list of symptoms to a codomain containing a single disease. is generated to include both the symptoms that the user most People with the same medical condition may experience probably experiences at the time and would probably want different symptoms, for example, people with Otitis Media to report, and also the most relevant symptoms that would may or may not experience a headache or a discharge from help the physician or the diagnostic method set a reliable the ear. An individual symptom is usually typical for several diagnosis. Based on the chosen symptoms, the algorithm different diseases. For example, elevated temperature is then asks for a couple of additional symptoms in order to typically exhibited in cases of Tuberculosis, Pneumonia, narrow down the diagnosis and direct the user to one or Strep Throat, Otitis Media, and others. On the other hand, more specialized modules that confirm or reject the even healthy people (“absence of conditions”) often suggested diagnosis. There are four specialized modules. A experience some symptoms due to reasons that are not module “To see” includes a camera which is used to connected to diseases. Fatigue may be related to a lack of diagnose Melanoma and Strep Throat. Using a special sleep while high blood pressure may be a consequence of camera is advantageous to using the integrated camera in a drinking caffeinated drinks. In addition, asking the patient smartphone since the specifications of phone cameras may for all symptoms on the list is not considered user-friendly, vary from a model to a model. In addition, light conditions therefore the goal is to diagnose the medical condition as are easier to control with a dedicated module. A module “To accurately as possible using as small number of questions as hear” includes microphones that are used to monitor possible. In order to achieve the best performance, the breathing – in order to detect pulmonary diseases. This algorithm combines expert medical knowledge and methods module also allows user to perform a spirometry which is of artificial intelligence. At this point, we only aim to used to diagnose COPD. The urine module, “Pee”, performs diagnose the diseases of patients with a single medical urine analysis using test strips and a camera that reads the condition. Diagnosing a combination of more than one test results. The fourth module is called “Blood” and is disease for a single patient is a next-level problem. intended for blood tests. In order to achieve best user experience, this module should rely on non-invasive As discussed above, the initial input for the algorithm comes methods, such as spectroscopy, although it is more likely from the vital signs measurements (symptoms such as that a drop of blood will be required for analysis in the final elevated temperature or high blood pressure) and from the version. This module is intended for detection of diabetes pain symptoms that the user chooses. Additionally, for and anemia. personalized tests, the algorithm may also include identified risk factors for a particular user (from the algorithm point of view, we also treat the risk factors as “symptoms”). 
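The step described here, turning vital-signs measurements and risk factors into the "initial symptoms", can be sketched as a small rule-based mapping. This is only an illustration, not the authors' code; the thresholds and risk-factor rules below are assumptions chosen for the example.

# Illustrative sketch: deriving the "initial symptoms" from vitals and risk factors.
def initial_symptoms(vitals, user_profile):
    """vitals: dict of measured values; user_profile: dict with age, BMI, smoker flag."""
    symptoms = set()
    if vitals.get("temperature", 0.0) > 37.5:        # assumed cut-off in degrees C
        symptoms.add("elevated temperature")
    if vitals.get("systolic_bp", 0) > 140:           # assumed hypertension threshold
        symptoms.add("high blood pressure")
    if vitals.get("spo2", 100) < 94:
        symptoms.add("low oxygen saturation")
    # risk factors are treated as symptoms from the algorithm's point of view
    if user_profile.get("smoker") or user_profile.get("age", 0) > 65:
        symptoms.add("risk factor: COPD")
    if user_profile.get("bmi", 0) > 30:
        symptoms.add("risk factor: diabetes")
    return symptoms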
For example, smokers and older people are more likely to develop COPD than non-smokers, people with a high BMI have higher risks for diabetes, etc. All these are called the “initial symptoms”. The additional list of suggested symptoms is generated using association rules (ARs) and the minimum-Redundancy-Maximum-Relevance (mRMR) method. The ARs (symptom A –> symptom B) are used to produce a set of probable additional symptoms. The goal of the mRMR method is to select symptoms that are as mutually dissimilar as possible and at the same time as indicative of the medical condition as possible. In other words, the algorithm tries to avoid asking the user about several similar symptoms and at the same time ask about symptoms that cover all spectrum of probable medical Figure 1: MESI Simplifying diagnostics system: a bracelet, smartphone app, and four diagnostic modules – To see, To conditions. hear, Blood, and Pee. The Shield module is not shown here, it comes in form of a sleeve with attachable electrodes. In the following step, the information gathered up to this point is used for actual disease prediction. The probabilities 44 for the 15 medical conditions are evaluated using a set of patient to use a specialized module which confirms or rejects J48 classifiers, one for each of the conditions. There are two the prediction. probability thresholds: conditions above the high threshold are considered very probable and conditions below the low An overview of the algorithm, developed at Jožef Stefan threshold are considered unlikely. The area between the two Institute, is presented. The algorithm combines expert thresholds is a so-called “gray zone” where we do not have medical knowledge with methods of artificial intelligence enough information to make a reliable claim whether the and machine-learning. The aim of the algorithm is to make medical condition is present or not. The diagnostic an accurate prediction of diagnosis with a small number of procedure terminates when all conditions from the list are questions, to improve the user experience. We outline the either above the high or below the low threshold. If one or challenges of the task. Testing of the algorithm on real more condition remain in the gray zone, at least one patient data is currently underway and the results will be additional question (symptom) is required for a confident published later. prediction. The additional symptom is chosen according to the highest information gain (IG) that an individual Reference symptom would bring. [1] http://www.xprize.org/ [2] http://space.xprize.org/ansari-x-prize Calculation of the IG and mRMR values, searching for ARs, [3] http://www.qualcommtricorderxprize.org/ and building the J48 classifiers is based on two types of data [4] http://www.aezonhealth.com/ – real and simulated patient data. Real patient data was [5] http://www.clouddx.com/ collected either with both patients with medical conditions [6] http://www.vitalsplus.com/ and healthy individuals filling in a questionnaire about the [7] http://www.dnamedinstitute.com/ symptoms they experience (a complete set of symptoms), or [8] http://dbg.ncu.edu.tw/ by medical doctors retroactively filling in the symptom [9] http://www.basilleaftech.com/ tables for real patients. The simulated dataset was build [10] https://www.scanadu.com/ using expert medical knowledge. 
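The two-threshold questioning loop described in this section can be condensed into a short sketch. It is only an illustration of the idea, not the published system: the per-condition classifiers (each assumed to expose predict_proba returning the probability that the condition is present), the ask_user and information_gain helpers, and the threshold values are all assumptions.

# Illustrative sketch: gray-zone loop that asks for one more symptom at a time.
def diagnose(symptoms, classifiers, candidate_symptoms, ask_user, information_gain,
             low=0.2, high=0.8, max_questions=20):
    """symptoms: dict symptom -> 0/1 holding the initial symptoms of the current patient."""
    probs = {c: clf.predict_proba(symptoms) for c, clf in classifiers.items()}
    for _ in range(max_questions):
        gray = [c for c, p in probs.items() if low <= p <= high]
        unasked = [s for s in candidate_symptoms if s not in symptoms]
        if not gray or not unasked:      # every condition is confidently decided
            break
        # ask about the symptom with the highest expected information gain for the gray zone
        next_symptom = max(unasked, key=lambda s: information_gain(s, gray, symptoms))
        symptoms[next_symptom] = ask_user(next_symptom)   # 0 or 1 answered by the user
        probs = {c: clf.predict_proba(symptoms) for c, clf in classifiers.items()}
    very_probable = [c for c, p in probs.items() if p >= high]
    return very_probable, probs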
Physicians prepared a table [11] http://www.scanurse.com of probabilities for patients suffering from each of the [12] http://www.intelesens.com medical conditions to exhibit each of the symptoms from the [13] http://www.simplifyingdiagnostics.com/ list, based on their professional experiences. Using this [14] M. Somrak, M. Luštrek, J. Šušterič, T. Krivc, A. table, it is possible to generate millions of distinct “virtual Mlinar, T. Travnik, L. Stepan, M. Mavsar, M. Gams: patients”. Initial tests using only simulated data showed high Tricorder: Consumer Medical Device for Discovering sensitivity and specificity for disease diagnostics [15]. Tests Common Medical Conditions, Informatica 38 (2014) using a combination of real and simulated data are currently 81–88. underway. [15] M. Somrak, A. Gradišek, M. Luštrek, A. Mlinar, M. Sok, M. Gams: Medical diagnostics based on combination of sensor and user-provided data. AI- 5 CONCLUSIONS AM/NetMed 2014, Artificial Intelligence and Assistive We present an overview of the Qualcomm Tricorder Medicine: proceedings of the 3rd International XPRIZE competition and the teams that reached the final Workshop on Artificial Intelligence and Assistive round, with a special focus on the Slovenian team entry. The Medicine co-located with the 21st European Conference approaches of many teams are similar to some degree. The on Artificial Intelligence (ECAI 2014), Prague, Czech most common approach is to use of single a device with a Republic, pp. 36-40 number of integrated sensors to monitor vital signs (the first competition task). The second task, the diagnosis of medical conditions, is typically achieved using a series of dedicated additional modules. Some teams rely strongly on detection of biomarkers in body fluids while others also incorporate technologies such as spirometry and image-processing algorithms. Several teams mention they use algorithms for diagnostics, although not much has been revealed to the public so far. The MESI Simplifying diagnostics approach consists of a bracelet and a “Shield” module to monitor vital signs. The diagnosis of medical conditions is obtained using an algorithm that runs on a mobile device. The algorithm uses the vital signs data and the symptoms entered by the patient to predict a possible medical condition and to direct the 45 AVTOMATIZACIJA IZGRADNJE BAZE ODGOVOROV VIRTUALNEGA ASISTENTA ZA OB ČINE Leon Noe Jovan, Svetlana Nikić, Damjan Kužnar, Matjaž Gams Odsek za inteligentne sisteme, Institut “Jožef Stefan”, Jamova cesta 39, 1000 Ljubljana e-mail: leon.jovan@gmail.com POVZETEK vprašanja je zelo zamudno, kar lahko vpliva na to, koliko občin bo sodelovalo pri projektu. Zato se pojavlja potreba po Asistent je inteligentni virtualni pomočnik, ki odgo- avtomatizirani rešitvi, ki bi ustvarila odgovore za posamezno varja na vprašanja, postavljena v naravnem jeziku in je občino za celotno zlato osnovo. sposoben poiskati odgovore na spletnih straneh. Cilj pro- Osnovna ideja naše rešitve je, da poskušamo čimbolj av- jekta Asistent je vzpostavitev spletne storitve za izdelavo tomatizirati vnašanje podatkov v bazo asistenta. S klasi- in urejanje prilagojenega virtualnega asistenta, ki si ga fikacijo želimo določiti, na kateri strani spletni strani občine bodo lahko občine namestile na svoje spletne strani in se nahaja podatek, ki ga zahteva posamezno pravilo. S tako obiskovalcem olajšale iskanje informacij, ki jih stran kratkimi skriptami pa želimo nato iz spletne strani pridobiti ponuja. 
Ta prispevek opisuje postopke avtomatizacije iz- podatek ter ga prikazati v kratkem, uporabniku prijaznem gradnje baze asistentovih odgovorov z uporabo različnih odgovoru. pristopov strojnega učenja ter ekstrakcije informacij iz Na koncu projekta bi tako vsaka občina lahko imela sebi spletnih strani. Ta postopek bo olajšal delo občin pri prilagojenega asistenta, s čimer bi vsi občani dobili možnost uvajanju asistenta, kar predvsem vpliva na razširjenost naravnega poizvedovanja in komuniciranja z občinami. uporabe na spletnih straneh občin. Opisana je os- novna ideja arhiterkture sistema za avtomatizacijo grad- Problem je soroden ekstrakciji informacij (angl. Informa- nje odgovorov virtualnega asistenta, predvsem pa so pred- tion extraction) iz HTML dokumentov [4]. Svetovni splet je stavljeni pristopi za generiranje odgovorov in ekstrakcijo zbirka velike količine dokumentov, vendar pa podatki niso na- podatkov iz spletnih strani. jbolje strukturirani. Naša naloga je, da iz takšnih nestrukturi- ranih podatkov najdemo podatke, ki so za nas uporabni. V zadnjem času je bilo predlaganih več različnih pristopov 1 UVOD za ekstraktcijo informacij iz spleta. Pristopi vključujejo Večina slovenskih spletnih strani je omejena z uporabo uporabo strojnega učenja, iskanja vzorcev in podobno z ra- starejših spletnih tehnologij, kar povzroča oteženo iskanje po zličnimi stopnjami avtomatizacije [5]. njih. Splošni iskalniki v povprečju najdejo le med 10% do 30% ustreznih odgovorov [13]. Ena izmed možnih rešitev je 2 PROJEKT ASISTENT inteligentni virtualni pomočnik oz. asistent, ki zna odgov- oriti na vprašanja v naravnem jeziku. Asistenti se pojavljajo Celoten proces izgradnje baze odgovorov je sestavljen iz treh kot pomoč pri iskanju po spletnih straneh, pametnih telefonih, korakov, in sicer pridobivanje in priprava podatkov, klasi- itd. Ena najbolj poznanih asistentk na svetu je npr. Siri, ki jo fikacija in generiranje odgovorov. najdemo na novejših sistemih iOS podjetja Apple Inc. in se jo Najprej moramo podatke pridobiti in jih pripraviti za lahko uporablja v več svetovnih jezikih. Prva virtualna asis- nadaljno obdelavo. Podatke pridobimo s spletnim pajkom, tentka v slovenščini pa je Vida, ki je nastala kot pomoč pri ki obišče vse spletne strani občine in jih v obliki HTML iskanju po straneh DURSa. dokumenta shrani v interno bazo podatkov. Iz teh datotek Cilj celotnega projekta [12] je ustvariti virtualne asistente nato izluščimo celotno besedilo, ga lematiziramo in označimo za slovenske občine, ki bi bili potem lahko dostopni preko besede z označevalnikom, saj bomo te podatke uporabili v njihovih spletnih strani. naslednjih korakih. Ta dva postopka naredimo z lematizator- Vsaka baza znanja za neko občino je sestavljena iz vnosov, jem LemmaGen [9] in Oblikoslovni označevalnik za sloven- vsak vnos pa vsebuje vprašanje, imenovano pravilo, in ski jezik [7]. Spletnega pajka za shranjevanje spletnih strani odgovor, pri čemer so pravila ključne besede vprašanj, ki jih smo izdelali z uporabo Java knjižnjice Jsoup [6]. zastavljajo uporabniki. Vsak asistent občine ima približno Naslednji korak je klasifikacija, kjer moramo pridobljene 500 pravil, ki so enaka za vse občine, odgovore pa je potrebno spletne strani razvrstiti med skoraj 500 vnosov. Za klasi- kreirati za vsako posamezno občino. Tem pravilom pravimo fikacijo uporabimo lematizirana besedila spletnih strani, ki “zlata osnova”. 
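A minimal sketch of the page-classification step introduced here follows: each lemmatized municipality page is assigned to one of the roughly 500 rules, using the bag-of-words / TF-IDF representation mentioned in this paper. The sketch is illustrative only; scikit-learn and logistic regression stand in for the project's Java implementation, whose classifier is not specified here.

# Illustrative sketch: classifying lemmatized pages among the ~500 "zlata osnova" rules.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_page_classifier(lemmatized_pages, rule_ids, max_features=5000):
    """lemmatized_pages: list of page texts (already lemmatized and POS-tagged),
    rule_ids: the rule each training page answers (labels from manually filled entries)."""
    model = make_pipeline(
        TfidfVectorizer(max_features=max_features),  # keep only the most important words
        LogisticRegression(max_iter=1000),
    )
    model.fit(lemmatized_pages, rule_ids)
    return model

# usage: rule = train_page_classifier(pages, labels).predict([new_page_text])[0]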
Zlata osnova je bila oblikovana iz pravil in smo jih predstavili kot vrečo besed [2], z mero TF-IDF [3] pa odgovorov, ki so jih določene občine ročno vnesle v svoje izberemo le najbolj pomembne besede, saj je preveč različnih asistente na začetku projekta. Ročno vnašanje odgovorov na besed, da bi obravnavali vse. 46 Figure 1: Arhitektura sistema Za generiranje odgovora za določen vnos imamo torej http://www.uradni-list.si/1/search?smode=ul& na voljo spletno stran, ki smo jo uvrstili, da vsebuje po- cmd=search&q=iskalni_niz&rubm=s& datke primerne temu pravilu, poleg pa tudi lematizirano in selectItem=id_obcine&rublist%5B%5D= označeno besedilo te strani, včasih pa tudi katere od drugih id_obcine virov (Wikipedia, Uradni list, ...). Iz teh podatkov torej gener- iramo kratek odgovor, ki ga prikaže virtualni asistent. Iskalni niz smo določili glede na pravilo, ID številko občine pa smo prebrali iz spletne strani tako, da smo poiskali ustrezen HTML element, ki se je nahajal med možnostmi za 3 AVTOMATIZACIJA IZGRADNJE BAZE ODGOV- filtriranje in je vseboval ime občine ter ID za filtriranje po OROV tej občini. Za pravilno delovanje smo vse šumnike pretvo- Izdelali smo program, ki je zmožen zgraditi bazo odgovorov rili s Percent-encoding [8], ki se uporablja za kodiranje ne- 471 vprašanj, ki so skupna vsem občinam (“zlata osnova”). standardnih znakov v URL naslovih. Rezultate iskanja smo Sistem za generiranje odgovorov je sestavljen iz 471 razre- nato prebrali iz vrnjene spletne strani. Če je vrnjen rezultat dov, izdelanih v programskem jeziku Java, ki so poimeno- samo eden, smo povezavo do tega elementa dodali v odgovor, vani po ključu Rule < IDpravila > glede na pravilo, za če pa je rezultatov iskanja več, se v odgovor dodajo prvi trije katerega generirajo odgovor. Razred, ki generira odgovor, dokumenti s pripisom, da se več dokumentov nahaja na strani potrebuje le spletno stran, iz katere črpa informacije, v neka- v ozadju. terih primerih pa tudi lematizirano in označeno besedilo te spletne strani. Osnoven razred je RuleClass, ki omogoča 3.2 Povzetki branje vseh potrebnih podatkov ter vračanje odgovora. Izde- lali smo hierarhijo razredov, ki dedujejo in so nadgradnja Vprašanja so v nekaterih primerih precej splošna in običajno RuleClass razreda, omogočajo pa generiranje odgovorov, ki zahtevajo daljši odgovor, ki ni primeren za asistentov so bolj specifični in rešujejo določen problem, ki se pogosto odgovor. Zato asistent odgovori s krajšim povzetkom, več pojavlja. Takšni problemi so pridobivanje obrazcev, kontak- o tem pa si uporabnik lahko prebere na strani, ki se mu tov, imen oseb, povzetkov iz daljšega besedila in povezav. prikaže v ozadju. Takšna vprašanja so na primer opisi kul- Pristopi za reševanje takšnih problemov so opisani v nadal- turnih znamenitosti, kmetijstva, predstavitev grba ter zastave jevanju. in podobno. Algoritem deluje na principu ključnih besed in regularnih 3.1 Vloge in obrazci izrazov, ki jih predhodno podamo za posamezno pravilo. Po- damo lahko seznam ključnih besed, ki jih algoritem pretvori Za pridobitev obrazcev, ki so specifični za vsako občino, v regularne izraze. Potem iz celotnega HTML dokumenta smo uporabili spletno stran Uradni list [11], ki objavlja za- rekurzivno odstranimo vse vrstične elemente, njihovo vsebino kone, predpise in druge javne objave v Republiki Sloveniji. pa dodamo staršu tega elementa. 
Ta postopek nam omogoča S pomočjo iskalnika na spletni strani računalnik pridobi lažje določanje besedilnih enot, saj nam vmesni vrstični ele- določen obrazec, ki ga potem posreduje uporabniku v obliki menti ne delijo enega odstavka na več delov. Po končanem odgovora. postopku odstranjevanja vrstičnih elementov, algoritem pre- Do dokumentov na spletni strani Uradni list smo dostopali gleda vsa vozlišča z besedilom (TextNode) in prešteje, ko- tako, da smo naredili zahtevek z ustreznimi GET parametri, likokrat se elementi iz seznama regularnih izrazov pojavljajo ki od strani zahtevajo, da nam poišče določene dokumente. v besedilu. Na podlagi te ocene algoritem izbere najboljši 47 odstavek in ga prikaže kot odgovor. dobro deluje generiranje odgovorov. Za učni primer smo vzeli občino Pivka in njene spletne strani ter vnose, saj ima najbolj popolne odgovore. Kvaliteto 3.3 Kontakti ustvarjenih odgovorov smo preizkusili na podlagi 16 drugih Nekateri kontakti so na voljo samo na straneh občine in jih je občin. Ročno smo pregledali vse vnose in strani, katere so potrebno pridobiti neodvisno od oblike strani. Sem spadajo vpisale občine, in izločili tiste, ki niso bile primerne. Tako na primer kontakt direktorja občine, svetovalcev in podobno. smo izločili vpliv napačne klasifikacije na kvaliteto ustvar- Kontakt je v osnovi sestavljen iz imena, naziva, telefonske jenih odgovorov. S tem smo pri nekaterih občinah močno številke ter elektronskega poštnega naslova oziroma kombi- zmanjšali število pravil, saj je bilo veliko podanih spletnih nacije le-teh. Najprej je bilo tako potrebno iz besedila pre- strani oziroma odgovorov napačnih ali pa so bili prepisani od poznati te elemente. neke druge občine. Rezultat generiranja odgovorov smo pre- Za prepoznavanje imen smo naredili bazo imen, pri čemer gledali ročno, jih primerjali z ročno vpisanimi odgovori, ki so smo za osnovo vzeli podatke z Wikipedie [1]. V besedilu smo jih izdelale občine, ter izračunali delež ustreznih odgovorov. nato lahko imena preprosto našli s pregledom baze. Odgovor je bil ocenjen kot ustrezen, če je vseboval iskane po- datke v uporabniku prijazni obliki. Sem torej niso šteti odgov- Za iskanje telefonskih številk smo si pomagali s knjižnico ori, ki pozovejo uporabnika, naj si ogleda stran v ozadju. Libphonenumber [10], ki najde v podanem besedilu vse tele- Rezultate ocenjevanja ustreznosti ustvarjenih odgovorov fonske številke neke države. prikazuje spodnja tabela 4. Prvi stolpec predstavlja občino, Elektronske naslove smo iskali s pomočjo regularnega za katero smo preverjali rezultate, drugi število vseh pravil, ki izraza: jih je občina vpisala in tretji število pravil, ki imajo pravilne ali vsaj delno pravilne odgovore. Zadnja dva stolpca pred- [_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@[A-Za- z0-9-]+(\\.[A-Za-z0-9]+) stavljata rezultate generiranja odgovorov. *(\\.[A-Za-z ]{2,}) Občina Vnešena Ustrezna Ustrezni Neustrezni pravila pravila odgovori odgovori Če si spletno stran predstavljamo kot drevesno strukturo Koper 343 311 (91%) 235 (76%) 76 (24%) HTML elementov, je ta algoritem iskal v tem drevesu najnižje Velenje 388 347 (89%) 269 (78%) 78 (22%) ležeče vozlišče, ki vsebuje vse te elemente - ime, naziv (po- Hrastnik 399 363 (91%) 268 (74%) 95 (26%) Hrpelje - Kozina 457 395 (86%) 314 (79%) 81 (21%) dane ključne besede), telefonsko številko in/ali poštni naslov. 
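The contact-extraction idea described above, finding the lowest HTML node that contains a name, a phone number and an e-mail address, can be sketched as follows. This is only an illustration: the project used Java with Jsoup and libphonenumber, whereas BeautifulSoup and simplified regular expressions stand in here.

# Illustrative sketch: lowest HTML element containing name, phone and e-mail.
import re
from bs4 import BeautifulSoup

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")    # simplified e-mail pattern
PHONE_RE = re.compile(r"\+?\d[\d /.-]{6,}\d")             # rough stand-in for libphonenumber

def lowest_contact_node(html, person_name):
    soup = BeautifulSoup(html, "html.parser")
    def matches(el):
        text = el.get_text(" ", strip=True)
        return person_name in text and EMAIL_RE.search(text) and PHONE_RE.search(text)
    candidates = [el for el in soup.find_all(True) if matches(el)]
    if not candidates:
        return None
    # if an element matches, all its ancestors match too, so the deepest candidate
    # is the lowest node that still contains every requested piece of information
    deepest = max(candidates, key=lambda el: len(list(el.parents)))
    return str(deepest)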
Kidričevo 209 187 (89%) 165 (88%) 22 (12%) Da bi bil odgovor v uporabniku prijazni obliki, smo odstranili Krško 456 375 (82%) 297 (79%) 78 (21%) še razne stile in vrnili HTML iz vozlišča kot odgovor. Log - Dragomer 294 244 (83%) 206 (84%) 38 (16%) Mengeš 338 270 (80%) 230 (85%) 40 (15%) Podvelka 202 193 (96%) 170 (88%) 23 (12%) Vodice 132 128 (97%) 113 (88%) 15 (12%) 3.4 Osebe Vransko 257 235 (91%) 202 (86%) 33 (14%) Celje 452 431 (95%) 350 (81%) 81 (19%) Nekatera pravila zahtevajo prepoznavanje oseb v besedilu Litija 456 264 (58%) 228 (86%) 36 (14%) in njihove vloge. Takšno pravilo je na primer ugotavljanje Šentjur 373 330 (88%) 247 (75%) 83 (25%) ˇ župana ali podžupanov občine. Za ta postopek smo upora- Zalec 448 425 (95%) 346 (81%) 79 (19%) Trbovlje 286 258 (90%) 212 (82%) 46 (18%) bili Oblikoslovni označevalnik za slovenski jezik [7], ki nam Povprečje 343 88% 82% 18% določi sklon, spol, število, besedno vrsto ter ugotovi ali gre za obče ali lastno ime. Vsi rezultati so bili narejeni na podlagi ročne klasifikacije. Župana na spletni strani smo prepoznali tako, da iščemo Nadaljnje delo tako vsebuje še klasifikacijo spletnih strani os- vsa zaporedja besed, dolga vsaj dve besedi, ki so sestavljena talih občin. Ker klasifikacija ne bo delovala popolno, lahko samo iz samostalnikov v ednini in so lastna imena. Ta za- pričakujemo nekoliko slabše rezultate. poredja besed se lahko začnejo tudi z besedami mag., dr., ali Ustreznih generiranih odgovorov je 82% odstotkov, kar pa župan v našem primeru. Vsa ta zaporedja besed še prever- ocenjujemo za zadovoljiv rezultat, ki olajša človekovo delo. imo, če vsebujejo eno od osebnih imen, ki smo jih pridobili na Višji odstotek so vrnili odgovori tistih občin, ki so imele manj wikipediji. Tako predpostavimo, da gre res za ime osebe. Na pravilno podanih povezav, saj so večinoma ostale splošne koncu izberemo osebo, ki se največkrat pojavlja v besedilu na povezave, ki so za vse občine podobne in jih je lažje generi- spletni strani. rati. 4 OBJEKTIVNA EVALVACIJA REZULTATOV 5 Zaključek Za evalvacijo rezultatov smo kot učno množico uporabili po- V tem dokumentu smo opisali postopke avtomatizacije iz- datke 16 občin, kjer so bile spletne strani ročno klasificirane. gradnje baze asistentovih odgovorov z uporabo različnih Klasifikacije za ostale občine nismo testirali, saj še ni popol- pristopov strojnega učenja ter ekstrakcije informacij iz splet- noma narejena, a to ne vpliva na testiranje generatorjev, na kar nih strani. Opisana je osnovna ideja arhiterkture sistema za smo se mi osredotočili, saj smo najprej želeli preveriti kako avtomatizacijo gradnje odgovorov virtualnega asistenta, pred- 48 vsem pa so predstavljeni pristopi za generiranje odgovorov in edge and Data Engineering, let. 18 št. 10, str. 1411-1428, ekstrakcijo podatkov iz spletnih strani. 2006. Preizkusili smo del sistema, ki iz spletnih strani prido- biva informacije in generira kratke odgovore na določena [6] Jsoup, Java HTML Parser. URL http://jsoup. vprašanja. Kvaliteta ustvarjenih odgovorov je zadovoljiva, org/. Pridobljeno 16. 9. 2014. 82% odstotkov odgovorov je bilo ustreznih, kar ocenjujejmo, da je dovolj, da olajša človeško delo. [7] Oblikoslovni označevalnik za slovenski jezik. URL Za nadaljnje delo načrtujemo izdelavo klasifikacijskega http://www.w3schools.com/tags/ref_ modela, ki bo za vsako pravilo poiskal spletno stran občine, urlencode.asp. Pridobljeno 16. 9. 2014. na kateri se nahaja ustrezen podatek. 
Od klasifikacije je zelo odvisen tudi generator odgovorov, ki smo ga izdelali, saj [8] HTML URL Encoding Reference URL http: lahko deluje precej slabše ob slabši klasifikaciji. //oznacevalnik.slovenscina.eu/Vsebine/ Sl/ProgramskaOprema/Navodila.aspx. Pridobljeno 16. 9. 2014. References [1] Seznam osebnih imen. URL [9] LemmaGen, Multilangual Open Source Lemmatisation. http://sl. URL http://lemmatise.ijs.si/. Pridobljeno wikipedia.org/wiki/Seznam_osebnih_ 16. 9. 2014. imen. Pridobljeno 1. 9. 2014. [2] Bag-of-words representation of text. URL https: [10] libphonenumber, Google’s phone number handling //inst.eecs.berkeley.edu/˜ee127a/book/ library URL https://code.google.com/p/ login/exa_bag_of_words_rep.html Pri- libphonenumber/. Pridobljeno 16. 9. 2014. dobljeno 5. 9. 2014. [11] Uradni list Republike Slovenije URL http://www. [3] Tf-idf weighting. URL http://nlp.stanford. uradni-list.si/. Pridobljeno 18. 9. 2014. edu/IR-book/html/htmledition/ [12] Projekt Asistent, Virtualni asistent za občine in tf-idf-weighting-1.html Pridobljeno 5. društva. URL http://www.projekt-asistent. 9. 2014. si/wp/. Pridobljeno 18. 9. 2014. [4] InformationExtraction URL http://en. [13] Projekt Asistent, Virtualni asistent za občine in wikipedia.org/wiki/Information_ društva, opis projekta. URL http://www. extraction Pridobljeno 3. 9. 2014. projekt-asistent.si/wp/?page_id=100. [5] C. Chang, M. Kayed, M. R. Girgis in K. F. Shaalan. A Pridobljeno 18. 9. 2014. Survey of Web Information Extraction Systems. Knowl- 49 INFERRING MODELS FOR SUBSYSTEMS BASED ON REAL WORLD TRACES Rutger Kerkhoff, Aleš Tavčar1,Boštjan Kaluža1 Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenija1 e-mail: rutger.kerkhoff@gmail.com, {ales.tavcar, bostjan.kaluza}@ijs.si ABSTRACT As a proof of concept this article will focus on simulating a single smart house. While this model will be Creating simulations for smart cities is a complex and time consuming task. In this paper we show that using significantly less complex, and even allow for exact traditional Bayesian networks and real world data simulation, it is ideal for showing the power of this traces it is possible to infer models that can simulate the approach. In this article we will show that using a manually original domain. The created model can provide great defined Bayesian network already allows for quite accurate insight into the actual subsystems that are considered. predictions. Expanding to an automatically learned network We show that given a set of observed values we can improves these even more, showing great promise for successfully use the created model to simulate data and modelling a complete smart city. show trends present in the original system. 2 RELATED WORK 1 INTRODUCTION Using a Bayesian network to model a system is not a new With modern cities becoming more complex and ever idea, take for example a water supply network [5], where increasing in size it is of vital importance to control and the authors try to predict when a pipe will burst. It has also optimize the different systems present in a city. While the been used to model mobility within a city [3] or to predict different systems already available in a city, e.g. the power when to replace parts of the New York power grid [2]. grid, waste management, bus scheduling etc., can and are While Bayesian networks have not yet been used to model a being optimized, the bigger picture has not been explored complete city, there are approaches which try to tackle large yet. 
Connecting them allows further optimisation and or distributed models. For example a hierarchical object realises emergent behaviour; something that cannot be oriented approach [7] or building local networks and observed when looking at a subsystem alone. identifying which variables likely link the systems [6]. An important aspect of realising such a system is While not needed for the model of the smart house understanding the relations between the subsystems and discussed in this paper these techniques will likely be even within the subsystems themselves. A concrete example necessary when considering a smart city. As an example of of where this knowledge is needed is simulation. It is of what possible data streams and variables can be found in a vital importance to thoroughly test such a city-controlling smart city one can look at sensor data streams in London system before applying it, and for that simulations are [8]. required. A simulation will also allow faster development of While other papers focus mainly on classification or new systems and applications within the smart city. decision support systems [4], we focus on recreating the However, simulating a smart city is an immensely complex original data and trends for a smart environment. We are not task, and in all but the simplest cases it is infeasible to aware of any complete simulation and creation of a trace specify all the variables and relations concerned in such a using a Bayesian network. simulation. We therefore propose to learn a model, representing all the variables and relations in the smart city, 3 DOMAIN from real-world data traces. The domain explored in this paper is that of a smart house, To allow for a better understanding of the system and which is less complex than a city, but it is well suited to the ability to simulate new traces we use a graphical model, show the ideas and explore the methodology. The traces a Bayesian belief network [1] to be exact. In this article we used to model, learn and simulate the smart house are will focus on different ways of creating the network and obtained from the EnergyPlus simulator [14]. The simulator inferring the probabilities. The ability to graphically was developed within the OpUS system [13] and it is based represent the network makes it an ideal tool for on a real world building. While simulating a simulator understanding the modelled system. Moreover, we can input seems pointless, it allows for testing of the model in varying a set of observed variables and update, according to those situations without having to gather real world data for an values, the probabilities of the unobserved which makes it a extended amount of time. For simulating the city this will be suitable tool for creating simulations. necessary. 50 The simulation has 6 input variables, all concerning the Here x are the different possible outcomes for that node. environment. One should think of values for outside Simply counting the occurrences in our training data gives temperature, wind etc. The simulator also supports setting if us a probability table for every possible outcome of X the occupant is present, the house was set to be empty from depending on every possible outcome of Xi, ...Xj . Because 7:00 to 17:00 every day. Redundant output variables were not every possible combination is observed we use a prior α filtered out and we are left with 9 variables, ranging from = 0.5 as an initial count on each variable. 
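The counting-based estimation with the prior count of 0.5 described just above can be written down in a few lines. This sketch is illustrative only and not the authors' WEKA-based code; the data layout (a list of dicts of already-discretized values) is an assumption.

# Illustrative sketch: conditional probability table of node X given its parents,
# with a prior count alpha = 0.5 added to every cell so that unobserved
# combinations keep non-zero probability.
from collections import Counter, defaultdict

def estimate_cpt(data, node, parents, node_values, alpha=0.5):
    """Returns P(node = x | parent assignment) for every parent assignment seen in data."""
    counts = defaultdict(Counter)
    for row in data:
        parent_key = tuple(row[p] for p in parents)
        counts[parent_key][row[node]] += 1
    cpt = {}
    for parent_key, counter in counts.items():
        total = sum(counter.values()) + alpha * len(node_values)
        cpt[parent_key] = {x: (counter[x] + alpha) / total for x in node_values}
    return cpt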
inside temperature, heating coil power consumption to As the observant reader might have noticed the equation electricity produced by a solar panel. All variables are 2 requires discrete variables, and thus we discretized all our recorded in an interval of 15 minutes and are continuous. variables into 10 equal sized buckets. The partition intervals The subsystems that we want to simulate in the city can be where chosen based on a minimum description length seen as the different devices in the house, e.g. the solar principle. For more information and a formal definition we power generator reacting to an increase in solar radiation. refer the reader to Fayyad et al. [12]. We input the values of our observed testing variables 4 METHOD and use this knowledge to calculate the remaining variables. This is done by the junction tree algorithm [10], it To learn a model of the house we use Bayesian networks. calculates the marginal distribution for each unobserved We decided on using Bayesian networks as they do not only node based on the values of the observed nodes. We take perform well when predicting, they also have an the most likely value for each node, set it as evidence and understandable structure that can give insight to the update the margins, till all the variables are set. situation being modelled. We assume the reader to be familiar with the subject as described by Heckerman et al. 5 RESULTS [1]. In this section we will explain the specific parameters and choices made for our implementation. In this section we present the experimental results for the Creating a Bayesian network can be seen as a two stage models presented in the previous section. In section process. First, we define a network structure, nodes and how 5.1 we will look at the structure of the networks and in variables are related as a directed acyclic graph. Second, we section 5.2 we will analyse the performance of the complete must set the probabilities distributions of all the different models. nodes depending on their ancestors. We define our network The data used was obtained from running the as G, our nodes as V and the arcs between the nodes as A. EnergyPlus simulator. We used two days of simulated data for training the model and two days for testing. The G = ( V, A) (1) variables were all recorded at 15 minute intervals. The During the first stage the structure of the network can be concrete implementation was done in Java using the WEKA defined either manually or automatic. First, we define it library [11]. manually, leveraging our knowledge of the system to define 5.1 Network Structure which nodes should be related. An advantage of this is that we will not get an over-fitted network based on the We first created a manual structure of the network. Since coincidences on our training data. A drawback is that we the selected domain is relatively simple with only 17 might miss relations we did not know beforehand, and that variables it could be largely comprehended by the analysis for larger networks defining a structure quickly becomes a of the data. We looked at correlations between the complex task. variables, used our knowledge of the physical processes and Because the drawbacks will become more significant in experimented with a few possible networks to get a network the smart city we also implement an automated approach to that covered all the variables. Because of computational learn the structure from the data. 
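One reasonable reading of the simulation loop described in this section, set the observed inputs as evidence, then repeatedly fix the most confidently predicted unobserved node at its most likely value and update the marginals, is sketched below. It is not the authors' code: infer_marginals is an assumed helper (for example backed by a junction-tree implementation) that returns, for every still-unobserved node, a dict of value -> posterior probability given the current evidence.

# Illustrative sketch: greedy evidence propagation until all variables are set.
def simulate_trace(observed, all_nodes, infer_marginals):
    """observed: dict node -> value for the input (environment) variables."""
    evidence = dict(observed)
    while len(evidence) < len(all_nodes):
        marginals = infer_marginals(evidence)          # posteriors given current evidence
        # pick the unobserved node whose most likely value is most certain
        node = max(marginals, key=lambda n: max(marginals[n].values()))
        best_value = max(marginals[node], key=marginals[node].get)
        evidence[node] = best_value                    # fix it and update the margins
    return evidence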
We use a local search challenges some relations were simplified to restrict the based metric, the “Look Ahead in Good Directions, LAGD, number of parents of a node. Precautions were taken to hill-climbing algorithm [9]. It looks at a sequence of best make sure no relations were lost, for example, the solar future moves, instead of a single move as is usually done for radiation influencing the solar panel which in turn hill climbing. The algorithm to calculate a sequence has influences the battery. The arc from solar panel to battery exponential time complexity, and therefore it first computes was removed and instead it was linked directly to the the best moves and then decides on a sequence. It scoring radiation. This goes at the cost of a little accuracy but does function considers conditional mutual information between not lose the relation. In Figure 1, a network created using the nodes. For a formal definition please see Abramovici et the LAGD hill climber algorithm with at maximum 1 parent al. [9]. is shown. The figure does not show all the variables, as the The second problem, learning the probability nodes that do not have any arcs are left out for the sake of distributions, is solved by calculating direct estimates of the readability. The reasons why certain variables do not have conditional probabilities using a set of training data. For a arcs will be discussed in the next subsection. It is single node X ∈ V we define the probability that X = x as: interesting that a lot of arcs seem to be reversed. Note; however, that setting evidence for any variable will update P (x) = P (X = x|Xi, ..., Xj ) (2) 51 Figure 1: Network generated by LAGD Hill climber margins throughout the network, and thus even a reverse learned network. For the heating coil and the washing relation is still captured. Automatic network generation can machine even the baseline has an exceptionally good model inconsistent relations, for example, Net Purchased prediction power. This is because they are off in all but a few Power related to Wind speed seems to be over fitted on a cases, showing the importance of complete training data. The coincidence in the data. The other curious relations, like manual network performed better on a subset of the Solar Produced and Inside temperature can be explained by variables, for example power used by lighting. Lighting can the fact that all the input variables are closely related, on a be clearly linked to time, this was not found by the LAGD warm day there will be more solar radiation etc. algorithm as it does not occur often in the training data. However, in most cases automatic structure generation did 5.2 Performance perform better. Some of the interesting variables are the To test the performance of the networks we computed the predicted values of inside temperature (Fig. 2b), solar power percentage of correctly predicted values per variable and produced (Fig. 2c) and battery charged state (Fig. 2a). Note the root mean-square-error (RMSE). However, even more that the predicted data was discretized into buckets and for important than predicting the right values is to predict the each datapoint the average of the bucket was used for the trends in the data. A few erroneous spikes are of lesser calculation of the RMSE and as data points in the graphs. importance then missing a trend. Therefore we also plotted The predicted inside temperature over time shown in the data and did a visual analysis of the results. 
Figure 2b follows the general trends for both models, though the manual network is over-fitted on outside temperature and Baseline Manual Automatic clearly performs worse. Another problem is that because of Variable % E % E % E the discretisation some small changes are amplified. In Figure 2c the prediction of power produced by the solar Battery charge [J] 8 85e5 82 21e4 74 47e4 panel is shown. It is closely correlated to one of the input Heating coil 99 1.15 99 1.15 99 1.15 variables, solar radiation, and is therefore quite accurately Washing machine 98 4.27 98 4.27 98 4.27 predicted. The second production increase is not seen in the Total demand 51 87.54 51 87.54 60 15.17 graph for the manual network as the values are still in the Total produced 67 84.32 67 84.32 76 28.52 range of the first bucket. Cooling coil 69 46.49 35 49.45 77 22.25 Predicted battery charge state over time can be seen in Net purchased 65 35.61 65 35.61 65 35.61 figure 2a. This is difficult to predict as it depends on many variables. As can be seen the manual network is over-fitted Solar produced 60 88.31 81 34.58 82 35.37 on a wrong variable; time. The first charge peak happens to Lights 72 35.53 88 12.46 72 35.53 coincide in the training and testing data and therefore the Inside temp [C] 34 2.62 30 0.6 72 0.24 percentage of correct prediction is still high. The automatic Overall 62 70 77 network is not conclusive in predicting a trend. For variables like a battery, which cannot drop quickly and get back up Table 1: All variables are power in watt [W] if not stated again, it will be a valuable extension to consider the previous otherwise. The columns labelled % depicts the percentage state as well. correctly classified. E is the RMSE, lower is better. In general, the automatically constructed network model In Table 1 the percentage of correctly predicted values performed better than the manually constructed one and and the RMSE for each variable for three separate possibly correct but yet unknown relations were found. The classifications is shown. First a baseline where the most variables could be predicted relatively accurately and most probable bucket was chosen based on just priors. Second the trends present in the original data could also be found in the manually constructed network and last the automatically generated data. 52 Figure 2: Predictions of system parameters 4 CONCLUSION [4] Lanini S., Water management impact assessment using a Bayesian network model, Proceedings of the 7th We have shown that it is possible to build complex models Conference on Hydroinformatics, 2006. from real world traces that model relations between [5] Babovic V., Drcourt J., Keijzer M., and Hansen P. F., A subsystems. These models can then be used to simulate the data mining approach to modelling of water supply system and generate more data based on a set of input assets Urban Water, vol. 4, no. 4, 2002. variables. We have seen that trends in the data can be [6] Rong C., Sivakumar K. and Kargupta H., Collective modelled and even single predictions can be used as an mining of Bayesian networks from distributed indication of expected data. heterogeneous data, Knowledge and Information While automatic generation of a Bayesian network for a Systems, vol 6, no. 2, 2004. domain is possible, some expert knowledge will still be [7] Molina, J. L., John Bromley, J. L. Garca-Arstegui, C. required to reduce over-fitting due to coincidences in the Sullivan, and J. Benavente, Integrated water resources data, and to improve the network. 
Due to the nature of the management of overexploited hydrogeological systems Bayesian networks the cooperation with a domain expert using Object-Oriented Bayesian Networks, can easily be established. Environmental Modelling & Software, vol. 25, no. 4, The biggest drawback of this method is that it largely 2010. depends on the available data. A lack of data will lead to [8] Boyle D., Yates D. and Yeatman E, Urban Sensor Data incomplete or incorrect models. Streams: London 2013, IEEE Internet Computing, vol. The most important direction for future work will focus 17, no. 6, 2013. on taking the temporal nature of the network into account. [9] Abramovici M., Neubach M., Fathi M. and Holland A., Expanding to dynamic Bayesian networks or Hidden Competing fusion for bayesian applications, Markov Models will allow for an even more accurate Proceedings of International Conference on Information prediction of trends. The challenge will be to cover the Processing, vol. 8, p. 379, 2008. unknown parameter space that is not directly present in the [10] Lauritzen S. L., and Spiegelhalter D. J., Local training data. Introducing Gaussian probabilities to closer model the values produced by different sensors is another computations with probabilities on graphical structures possible direction for future work. A third extension will and their application to expert systems, Journal of the handle tackling computational challenges that arise when Royal Statistical Society. Series B, 1988. the network sizes increases, those may be solved by creating [11] Hall M., Frank E., Holmes G., Pfahringer B., a more hierarchical network structure. Reutemann P., Witte I. H., The WEKA Data Mining Software: An Update, SIGKDD Explorations, vol. 11, References no 1, 2009. [1] Heckerman D., Bayesian networks for data mining, Data [12] Fayyad U. and Irani K., Multi-interval discretization of mining and knowledge discovery, vol. 1, no. 1, 1997. continuous-valued attributes for classification learning, [2] Rudin C., Waltz D., Anderson N. R., Boulanger A., Chambery, France, 1993. Chow M., Dutta H. Machine learning for the New York [13] Tavčar A., Piltaver R., Zupančič D., Šef T. and Gams City power grid, IEEE Transactions Pattern Analysis M., Modeliranje Navad Uporabnikov Pri Vodenju and Machine Intelligence, vol. 34, no. 2, 2012. Pametnih Hiš, Proceedings of Information Society 2013, [3] Fusco G., Looking for Sustainable Urban Mobility p 114-117, 2013. through Bayesian Networks, Cybergeo: European [14] US Department of Energy, EnergyPlus Energy Simu- Journal of Geography, 2003. lation Software, eere.energy.gov/buildings/energyplus/, accessed 08-2014. 53 INCLUSION OF VISUALLY IMPAIRED IN GRAPHICAL USER INTERFACE DESIGN Mario Konecki Faculty of Organization and Informatics University of Zagreb Pavlinska 2, 42000 Varaždin, Croatia Tel: +385 42 390834 e-mail: mario.konecki@foi.hr ABSTRACT readable in this kind of approach but even moderately large tables are almost impossible to present. Visually impaired programmers have been included into  programming industry since its very beginning and they Graphical charts interpretation – Graphical charts are were able to perform their jobs without difficulties. usually made of several different sub-elements and Graphical user interfaces and point and click method of shapes which cannot be adequately interpreted.  instructing computers have created many difficulties for Robustness issue – Inability to cope with new visually impaired programming professionals. 
Visually technology and constant software development. impaired have interest in programming just as everyone The same problems have emerged in programming domain else and the means of their inclusion in overall software where various graphical environments have appeared as development process are important issues that need to well as the need to create graphical user interfaces by point be resolved. One of disadvantages for visually impaired and click method since textual description of graphical is the lack of assistive technology that would enable elements for visually impaired was too complicated and them to design and create graphical user interfaces. In virtually impossible in practice [7]. And although this this paper the GUIDL (Graphical User Interface problem was not so prominent in the area of web Description Language) system that is aimed to resolve programming since its textual coding nature, it was very real the mentioned issues is presented and discussed. in the domain of classic desktop programming [6]. Inclusion of visually impaired as equals into all aspects of 1 INTRODUCTION social and business life remains important issue and enabling visually impaired to design graphical user Inclusion of visually impaired in the world of computers interfaces is one of its aspects. The interest of visually and programming has been present from the beginning of impaired for programming today is present and actual [1, computer mass usage [5]. Visually impaired have been able 12]. There are over 130 blind programmers registered at the to use computers and perform various programming tasks American Foundation of the blind programmers and by using assistive technology in the form of various text-to- programming is stated as potentially promising carrier speech synthesizers of which the most well-known are opportunity for visually impaired in Europe [2, 11]. JAWS, HAL Screen Reader, COBRA, Window Eyes and Inclusion of visually impaired into overall software Easy Web Browsing [7]. However, the graphical revolution development process which includes the design of graphical in the world of computers has made using computers for user interfaces is an actual and important issue. Its solution visually impaired much more difficult since text-to-speech in a form of GUIDL (Graphical User Interface Description synthesizers were not able to represent the context and Language) system is proposed and described in the rest of organization of graphical screens. Some problems that this paper. existing screen reading technologies came across are [7]:  Interpretation of images – Screen readers are not able 2 POSSIBLE APPROACHES TOWARDS SOLUTION to adequately interpret images. Only properties and descriptions of images can be presented. In order to solve the problem of inclusion of visually  Graphical layout and context – Screen readers read impaired into the process of overall software development information in linear way that is not sufficient to and to enable them to design and create graphical user interpret complex graphical user interfaces and screen interfaces several possible approaches can be taken [7]: organization.  Interpreters of specific graphical elements and  Reading of data tables – Because of linear way of attributes of every development environment could be reading information small tables are suitable and created 54  Audio support for creation of graphical elements could programming language. 
Conceptual model of GUIDL be incorporated into programming environments system is shown in Figure 1 [8].  A specific scripting language for every programming technology and environment could be developed All mentioned approaches are time consuming and specific to particular programming language and environment. In order to provide a more universal solution several Wrapper/ Final requirements must be satisfied [8]: mediator 1 UI  Easy usage: system has to be simple and easy to use so code 1 it can be used by programmers but also by designers and other interested computer users. GUIDL  Intuitive, simple and understandable syntax: system’s language language that will be used to describe the graphical elements has to be intuitive, simple to use and easy to understand. 2 TEXT NORMALI Wr ZAT appe I r O / N AND GRAPH Fina E l ME-TO-  Technology independence: system and its language for PHONEME CONVERS mediatIO or N 2 UI description of graphical user interfaces has to be code 2 applicable to various programming languages and development environments, not to be technology specific.  Extensibility: system and its language have to be to be Figure 1: Conceptual model of GUIDL system extensible so they can include support for new programming languages and corresponding In GUIDL system visually impaired programmers start the development environments as well as new graphical development of programming language specific graphical element and attributes. user interface by defining the interface layout in GUIDL Mentioned requirements have been evaluated through using all assistive built-in concepts. After the GUIDL code conducted research that confirmed stated requirements and has been written, the GUIDL lexer performs lexical added some new requirements with the requirement for analysis. Lexical analysis uses scanner that reads the code proper documentation being the most frequently mentioned. character by character and transfers it to lexer which checks whether received characters form an array that can be 3 GUIDL SYSTEM identified as an acceptable string or token. The set of In order to provide a more general solution a GUIDL acceptable strings or tokens are defined in GUIDL context- (Graphical User Interface Description Language) system free grammar [9] G = (V, Σ, R, S) where V is the finite set of nonterminal symbols, Σ is the finite set of terminal has been developed with GUIDL language as its core part. GUIDL language enables visually impaired to define all symbols, R is the set of substitutions or production rules of from A → α where A is graphical elements in one place using language that is some nonterminal symbol and α is a simple and has several assistive concepts such as: string over V ∪ Σ.  Predefined gradual sizes of forms S represents the start symbol which is nonterminal. Every  Predefined gradual sizes of graphical elements string x  (V ∪ Σ)* which has the form of yAz can be  turned into yαz by using production rule A → α that Predefined width/height attribute values  substitutes A with α. Set of terminals defines the language's Division of forms into quadrants  alphabet and Σ  V. 
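For reference, the grammar components used by the GUIDL lexer and parser can be written compactly in standard context-free-grammar notation; this restates the general definitions only, not the concrete GUIDL grammar:

\[ G = (V, \Sigma, R, S), \qquad V \cap \Sigma = \emptyset, \qquad R \subseteq \{\, A \rightarrow \alpha \;:\; A \in V,\ \alpha \in (V \cup \Sigma)^{*} \,\} \]

\[ yAz \Rightarrow_{G} y\alpha z \quad \text{for every production } A \rightarrow \alpha \in R \text{ and any } y, z \in (V \cup \Sigma)^{*} \]

\[ L(G) = \{\, q \in \Sigma^{*} \;:\; S \Rightarrow_{G}^{*} q \,\} \]

In other words, L(G) is exactly the set of token strings that the GUIDL parser accepts as syntactically valid code.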
The set of possible tokens or the Possibility to position graphical elements into one of language itself is defined as L(G) = {q  * : S  form’s quadrants G* q} or  all strings of finite length that are composed of zero or more Possibility to define the position offset of forms  symbols from  in a way that particular string can be Possibility to define the position offset of graphical generated from start symbol by using zero or more steps elements  which are defined by production rules. Detection of problems with position of graphical The set of tokens produced by lexer is processed by GUIDL elements (graphical element out of form boundaries)  parser that compares the order of the tokens against defined Automatic correction of problems with form’s grammar rules and in this way conducts the syntax dimension and position (form out of screen boundaries) validation of written code. After the syntax is validated the In this way visually impaired are able to define entire user parser creates corresponding syntax tree from which the interface in just one place and one technology and GUIDL GUIDL generator generates the final graphical user system then enables them to translate that interface into interface code for specific programming language and desired programming technology format which can then be environment. Described process is shown in Figure 2 [8]. included into native programming environment of chosen 55 GUIDL code x = a + b + c Lexical analysis GUIDL lexer Tokens id equals id plus id plus id Syntactic analysis GUIDL parser Syntactic tree GUI c ode generator Specific GUI code Figure 2: Steps in using GUIDL system 3.1 GUIDL syntax 4 GUIDL AS ASSISTIVE TECHNOLOGY GUIDL syntax has been designed to be simple and easy to GUIDL system and its language are designed to be simple use in order to enable visually impaired to perform their and understandable. Its main purpose is to provide visually design tasks in quick and efficient manner. Its form is impaired an assistive technology that would enable them inspired by the simplicity of BASIC (Beginner's All-purpose easier creation of designed graphical user interfaces. There Symbolic Instruction Code) syntax which has been created are several models that support development of assistive for beginners and for learning purposes. Partial GUIDL technology and one of them is CAT (Comprehensive grammar in EBNF [10] form is given below. Assistive Technology) [4] that was used in development of GUIDL system. CAT model is hierarchically structured project = projectcode, controlname, form; with four aspects as is shown in Figure 3 [4]. 
projectcode = 'Project ' | 'project '; form = formcode, controlname, formattributes, [controldeclarations], formend; Context formcode = 'Frm ' | 'frm '; formend = ('End' | 'end'), [eol]; controlname = qoute, word, qoute, eol; Comprehensive Person formattributes = frmcommonattributes, Assistive windowstateattribute, {colorattribute}; frmcommonattributes = textattribute, Technology Activities frmrestcommonattributes; model frmrestcommonattributes = frmsizeattribute, locationattribute; Assistive frmsizeattribute = sizecode, (frmsize | frmwidth, ws ,frmheight), eol; technology locationattribute = locationcode, xposition, ws, yposition, eol; Figure 3: CAT model frmsize = 'frmsize1' | 'frmsize2' | 'frmsize3' locationcode = 'Location = ' | 'Location=' | 'location = ' | In the case of GUIDL system development, the context is 'location='; composed of several aspects such as: teams and working xposition = 'left' | 'center' | 'right'; environment in which visually impaired programmers work, yposition = 'top' | 'middle' | 'bottom'; overall labor market, the attitude towards visually impaired 56 as programmers, government measures to aid visually elements. Possible approaches towards solution of impaired in getting jobs, etc. Person that uses GUIDL is mentioned problems have been presented in this paper visually impaired person that wants to be a part of overall along with GUIDL system as assistive technology that is software development process and wants to work as a aimed at including visually impaired in graphical user programmer of equal opportunities. Activities that are interface design as an integral part of overall software supported by GUIDL system are activities of quicker and development process. easier development of graphical user interfaces that will Evaluation of GUIDL system has shown that it is suitable as enable visually impaired to be included as equals in the an assistive technology and that it enables visually impaired work of the development team that they belong to. to perform the actions of graphical interface design in an GUIDL system has a meaning of assistive technology [3] in easier and more suitable manner. Adding new features and a way that it is designed to overcome the obstacle of concepts to GUIDL system will be a part of future work. designing and creating graphical user interfaces in graphical point and click development environments. GUIDL system References isn't build to replace development environments and to [1] Alexander, S. Blind programmers face an uncertain provide a specific isolated technology, it is designed to future, available at include visually impaired in actual programming http://www.cnn.com/TECH/computing/9811/06/blindpr technologies’ parts in which they had the most difficulties. og.idg/index.html, accessed 14th August 2014, 1998. Visually impaired programmers start their development in a [2] Cattani, R. The employment of blind and partially- native programming environment where they set a project. sighted persons in Italy: A challenging issue in a Then they use GUIDL to design graphical user interfaces changing economy and society, available at which are then included in already made project where http://www.euroblind.org/media/employment/employme visually impaired programmers continue with development nt_Italy.doc, accessed: 25th March 2011. and writing of program code which is something that they [3] Francioni, J. M.; Smith, A. C. 
Computer Science were always been able to do well by using another assistive Accessibility for Students with Visual Disabilities. ACM technology in a form of text-to-speech synthesizers. SIGCSE Bulletin, 34(1):91-95, 2002. In this way visually impaired become included in the design [4] Hersh, M. A.; Johnson, M. A. Assistive Technology for process and creation of graphical user interfaces as well as Visually Impaired and Blind People. Springer, 2008. in other segments of overall software development. [5] Hodson, B. Sixties Ushers in Program To Train Blind Programmers. Computer World Canada 11, 2004. 4.1 GUIDL system in practice [6] Konecki, M.; Kudelić, R.; Radošević, D. Challenges of GUIDL system has been tested on 47 participants that were the blind programmers. In Proceedings of the 21st given the GUIDL prototype along with instructions and Central European Conference on Information and examples of its use. All participants were given several Intelligent Systems, pages 473–476, 2010. practical programming tasks through which they had to [7] Konecki, M.; Lovrenčić, A.; Kudelić, R. Making evaluate whether the GUIDL system will provide an Programming Accessible to the Blinds. In Proceedings efficient assistive role in their programs development of the 34th International Convention. Croatian Society activities. The evaluation of GUIDL system has shown that for Information and Communication Technology, it indeed serves well as assistive technology. All performed Electronics and Microelectronics – MIPRO, pp. 820– tasks have been reported as easier and quicker when using 824, 2011. GUIDL system than when using purely native technology. [8] Konecki, M.: A New Approach Towards Visual GUIDL system has also been reported as a suitable mean of Programming for the Blinds. In Proceedings of the 35th inclusion of visually impaired into activities of overall International Convention on Information and software development that includes design of graphical user Communication Technology, Electronics and interfaces. Another important aspect of GUIDL as assistive Microelectronics - MIPRO, pages 935–940, 2012. technology is that it enables visually impaired to work in [9] Lewis, H. R.; Papadimitriou, C. Elements of the Theory actual technologies rather than having isolated and of Computation. Prentice Hall, Inc., 1981. specialized system. [10] Information technology – Syntactic metalanguage – Extended BNF. ISO/IEC 14977. Geneva, Switzerland: 5 CONCLUSION ISO, 1996. Visually impaired programmers have been a part of [11] bfi Steiermark, European Labour Market Report, computer revolution since its very beginning. Graphical available at user interfaces and occurrence of point and click http://eurochance.brailcom.org/download/labour- development environments have left visually impaired in market-report.pdf, accessed: 25th March 2011. difficult position since existing assistive technology in the [12] Blind Programming Project in NetBeans IDE, available form of text-to-speech synthesizers could not cope well at http://netbeans.dzone.com/videos/netbeans-ide-for- enough with rapid development and new graphical the-blind/, 2010, accessed: 20th August 2014. 57 MINING TELEMONITORING DATA FROM CONGESTIVE-HEART-FAILURE PATIENTS Mitja Luštrek1,2, Maja Somrak1,2 1Jožef Stefan Institute, Department of Intelligent Systems 2Jožef Stefan International Postgraduate School e-mail: {mitja.lustrek, maja.somrak}@ijs.si ABSTRACT observational study was carried out in the project with the intention to generate such knowledge. 
This paper presents The Chiron project carried out an observational study an initial analysis of the data gathered in this study. in which congestive-heart-failure patients were telemonitored in two countries. Data from 1,068 2 DATA FROM THE CHIRON STUDY recording days of 25 patients were gathered, consisting of 15 dynamic parameters (measured daily or 2.1 Data gathering and description continuously) and 49 static parameters (measured once The data analyzed in this paper were gathered in the period or a few times during the study). The features derived from May 2013 to May 2014. The whole study included 38 from these parameters were mined for their association CHF patients: 19 from the United Kingdom and 19 from with the feeling of good/bad health. The findings mostly Italy. However, some of the data were incomplete, so only correspond to the current medical knowledge, although the data of 12 patients from the UK and 13 patients from some may represent new insights. Italy were included in the analysis. These 25 patients together provided a total of 1,068 usable recording days. 1 INTRODUCTION The data consists of 64 parameters carefully selected based Telemonitoring of patients with chronic diseases is on their relevance to CHF [7]. becoming technically increasingly feasible, but benefits for The initial measurements of 49 static parameters were the patients are not always apparent, nor is it clear how to taken for each of the patients at the beginning of the study. make the most of the data obtained this way. In the case of This data includes general patient information (age, gender, heart failure, two systematic literature reviews showed BMI, waist-to-hip ratio, smoking, etc.), their current medical lower mortality resulting from telemonitoring [1][2], but in treatments (beta blockers, anti-coagulants, ACE inhibitors, the trials they reviewed, telemonitoring was mostly etc.), related health conditions (arrhythmias, hypertension, compared with conventional care worse than what is offered diabetes, etc.) and the results of a blood analysis today. Conversely, two large recent trials showed no benefit (hemoglobin, lymphocytes, LDL/HDL cholesterol, blood from telemonitoring [3]. However, the telemonitoring in glucose, Na and K levels, etc.). Some of these measurements these two trials was not very advanced – the monitored were repeated periodically every few weeks to provide up- parameters were limited and no intelligent computer to-date information. However, the exact period varied from analysis was involved. We can conclude from this that as the patient to patient and roughly half of the patients only had conventional care improved, so should telemonitoring. One the measurements taken at the beginning of the study. way to do so is by using intelligent computer methods on the During the study, the patients were wearing vital-signs gathered data, both to save the time of the medical personnel monitoring equipment [5] for several hours each day. The who would otherwise have to look at all the data themselves, equipment consisted of an ECG device, two accelerometers and to uncover previously unknown relations in the data. places on the chest and thigh, a body-temperature and a This paper describes the mining of telemonitoring data humidity sensor. The ECG recordings were subsequently from congestive-heart-failure (CHF) patients gathered in the analyzed to extract the physiological parameters related to Chiron project [4]. 
The objective of this project was to the heart rhythm: heart rate, QRS interval, QT interval, PR develop a framework for personalized health management interval, T wave amplitude and R wave amplitude. The with a focus on telemonitoring. The Chiron patients were accelerometers continuously provided the patient’s activity equipped with a wearable ECG, activity, body-temperature, and energy-expenditure estimation. The temperature and sweating and sensors. In addition, their blood pressure, humidity sensors provided the measurements of the skin blood oxygen saturation, weight, and ambient temperature temperature and sweating index in five-minute intervals. and humidity were measured [5]. The data gathered this way The patients were also provided with a mobile application was fed into a decision-support system, whose objective was for generating weekly and daily reports. The patients to estimate the health risk of the patients [6]. However, since reported their overall feeling of health with respect to the there is not enough knowledge on how to associate the previous day on a daily basis (feeling much worse than values of the various measured parameters with the risk, an yesterday, worse, the same, better or much better), and 58 answered 13 questions about their health and well-being on • Much worse vs. much better ( MW-MB) a weekly basis. In addition, they reported measurements of • Much worse or worse three times in a row vs. much systolic and diastolic blood pressure, body mass, blood better or better three times in a row ( MW3-MB3) oxygen saturation, and ambient temperature and humidity. • Much worse or worse vs. much better or better ( MWW- These – together with the continuously monitored MBB) parameters – are labeled dynamic in Section 3. • Much worse vs. everything else ( MW-E) The study also intended to gather data about hospital • Much worse or worse three times in a row vs. admissions and deaths, but no such events occurred during everything else ( MW3-E) the study period. Therefore we decided to use the patients’ • Much worse or worse vs. everything else ( MWW-E) self-reports of health instead. The analysis in this paper is based on the daily questions about the feeling of health. The majority of the data instances have the class ‘feeling the same as yesterday’, while very few instances have ‘feeling 2.2 Data preprocessing much better’ or ‘feeling much worse’. Because of this, the The ECG and accelerometer data recordings required the first three classes result in discarding the majority of the most attention when preprocessing the data prior to the data instances (only 69, 101 or 285 instances remain), while the mining. These two types of recordings also generated the last three use all 1,086 of them. Since classes are vast majority of all the gathered data. imbalanced, particularly in the last three cases, we used The ECG signal was already processed with the Falcon cost-sensitive classification, with the costs of algorithm [5], producing an output where each heart beat is misclassifications compensating for the imbalances. described with an 11-tuple. Because the tuples were not explicitly separated and some of them are incomplete, it was 3 MINING THE DATA important to distinguish between them in order to extract the Since the number of combinations of data-mining specified parameters. We used R-peaks in the ECG signal to algorithms, features and classes is huge, we designed a identify distinct tuples. 
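With the WEKA suite used for the data mining in the next section, a cost-sensitive wrapper of this kind can be set up along the following lines. This is only a sketch under stated assumptions: the file name chf_mw3_mb3.arff, the two-class setup (e.g. MW3 vs. MB3) and the random forest as base learner are hypothetical, and the misclassification costs are simply set inversely proportional to the observed class frequencies, which is one way of compensating for the imbalance.

import java.util.Random;

import weka.classifiers.CostMatrix;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.CostSensitiveClassifier;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CostSensitiveSetup {
    public static void main(String[] args) throws Exception {
        // Hypothetical export of one of the two-class datasets (e.g. MW3-MB3).
        Instances data = new DataSource("chf_mw3_mb3.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // Count the instances of each class to derive imbalance-compensating costs.
        double[] counts = new double[data.numClasses()];
        for (int i = 0; i < data.numInstances(); i++) {
            counts[(int) data.instance(i).classValue()]++;
        }

        // 2x2 cost matrix: the two off-diagonal cells penalise errors on the rarer class more
        // heavily (see WEKA's CostMatrix documentation for the row/column convention).
        CostMatrix costs = new CostMatrix(2);
        costs.setCell(0, 1, counts[1] / counts[0]);
        costs.setCell(1, 0, counts[0] / counts[1]);

        CostSensitiveClassifier classifier = new CostSensitiveClassifier();
        classifier.setClassifier(new RandomForest());
        classifier.setCostMatrix(costs);

        Evaluation evaluation = new Evaluation(data);
        evaluation.crossValidateModel(classifier, data, 10, new Random(1));
        System.out.printf("Accuracy: %.1f %%, weighted F-measure: %.2f%n",
                evaluation.pctCorrect(), evaluation.weightedFMeasure());
    }
}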
Additionally, a lot of the data was three-step data-mining procedure (described in detail in corrupt or missing, so those parts had to be removed. Sections 3.1–3.3): Similar problems occurred when processing the 1. Selection of algorithms that classify the data with a high accelerometer data. It was not possible to extract the accuracy and yield understandable models information about the activity and energy expenditure if a recording of any one of the axes of either of the two sensors 2. Using the selected algorithms, selection of features that was missing. If a patient forgot to wear both sensors, or one classify the data with a high accuracy and are had an empty battery, the data thus had to be discarded. understandable Finally, some data was not uploaded successfully to the 3. Using the selected algorithms and features, selection of servers due do connection problems, and some data are classes that result in accurate models missing as a result of inconsistent patients’ behavior. At the end of these three steps, we ended up with a number All of the parameters that were measured continuously of interesting models, some of which are presented in were further separated by the main activities of the day: Section 3.4. during lying, sitting and moving separately (resulting in features labeled per_act in Section 3) or during all the 3.1 Selection of algorithms activities together ( all_act). The ratios of the durations of In the first step we used MW3-MB3 classes and the avg these three activities were calculated for each day. For every subset of dynamic all_act features. We compared several parameter that was measured continuously or multiple times algorithms from the Weka suite [8] shown in Table 1. We per day, the average value ( avg) and standard deviation ( sd) selected the underlined algorithms for the experiments in were calculated; the calculations were done for separate Sections 3.2 and 3.3 due to their accuracy and in the case of activities and for the whole day. JRip to have another understandable algorithm. The key value whose association with the other Table 1: Comparison of data-mining algorithms monitored parameters we study in this paper – the overall feeling of health – was reported by the patients relatively to Algorithm Accuracy the previous day. Since the value is not absolute (e.g., Random Forest 79.3 % feeling well) but relative (e.g., feeling better or worse than Naive Bayes 77.4 % yesterday), it is associated with the measurements of both J48 76.3 % the current and the previous day. Because of that we SVM, Puk kernel 74.5 % introduced features that represent changes of the parameters’ SVM, linear kernel 74.2 % values with respect to the previous day ( chg). Again, the SGD 73.8 % calculations were done for separate activities and for the Multilayer Perceptron 73.2 % whole day. JRip 71.9 % For the purpose of data mining, classes were assigned to kNN, k = 1 60.9 % the data. If each of the five distinct feelings of health kNN, k = 2 56.2 % corresponds to one class, the differences between them are kNN, k = 3 47.8 % too small. Therefore we decided to have only two classes: SVM, RBF kernel 40.1 % 59 3.2 Selection of features 3.3 Selection of classes We first compared predefined features sets described in We compared the accuracies of different classes on all the Section 2. Since the number of combinations is large, we algorithms selected in Section 3.1 and all the features proceeded in several sub-steps. First, we compared subsets selected in Section 3.2. 
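A comparison like the one in Table 1 can be scripted directly against the WEKA API, for example as below. The ARFF file name is hypothetical, default parameters and a plain 10-fold cross-validation are assumed for every algorithm, and SGD and the multilayer perceptron are left out for brevity.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.SMO;
import weka.classifiers.lazy.IBk;
import weka.classifiers.rules.JRip;
import weka.classifiers.trees.J48;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class AlgorithmComparison {
    public static void main(String[] args) throws Exception {
        // Hypothetical export of the MW3-MB3 classes with the avg subset of all_act features.
        Instances data = new DataSource("chf_mw3_mb3_avg.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        Map<String, Classifier> candidates = new LinkedHashMap<>();
        candidates.put("Random Forest", new RandomForest());
        candidates.put("Naive Bayes", new NaiveBayes());
        candidates.put("J48", new J48());
        candidates.put("SVM (SMO, default kernel)", new SMO());
        candidates.put("JRip", new JRip());
        candidates.put("kNN, k = 1", new IBk(1));

        // 10-fold cross-validation of each candidate, reporting accuracy as in Table 1.
        for (Map.Entry<String, Classifier> candidate : candidates.entrySet()) {
            Evaluation evaluation = new Evaluation(data);
            evaluation.crossValidateModel(candidate.getValue(), data, 10, new Random(1));
            System.out.printf("%-28s %5.1f %%%n", candidate.getKey(), evaluation.pctCorrect());
        }
    }
}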
In Table 4 we report the F-measure of dynamic all_act features, finding that only avg and avg + for the Random Forest algorithm (most accurate overall), chg subsets performed better than the rest. The results are averaged over all the features. The F-measure was chosen shown in the first segment of Table 2 with the highest because of the class imbalance, particularly for the three ‘vs. accuracy for each algorithm in bold. Second, we added everything else’ pairs of classes. One can see that MW3- per_act features to these two subsets of features, finding the MB3 performed best, probably because it strikes the best extended features worse than all_act features alone (second balance between the difference between the two classes in segment of Table 2). And third, we combined these two the pair, and the number of instances in the dataset. MW- subsets of features with static features, finding them best of MB may have too few features, while in the other cases the all (third segment of Table 2). However, given the small difference between the two classes is too small. number of patients, it is likely that the static features Table 4: Comparison of classes identified individual patients instead of taking into account their general characteristics. Because of that we retained all Classes MW- MW3- MWW MW- MW3- MWW- the underlined features for experiments in Section 3.3. MB MB3 -MBB E E E F-measure 0.77 0.79 0.66 0.55 0.56 0.61 Table 2: Comparison of predefined feature sets Instances 69 101 285 1,068 1,068 1,068 Algorithm m 3.4 Interesting models e , o es d iv M ip n Classification models were built with the J48 and JRip a y k a rest V u 8 R 4 a o Features N B S P J J R F algorithms (being the most understandable of the five Dynamic, all_act, avg + chg 75.5 80.0 70.6 76.9 80.3 selected in Section 3.1) on all the features selected in Dynamic, all_act, avg 77.4 74.5 71.9 76.3 79.3 Section 3.2. Two examples are presented in Figure 1 and Dynamic, all_act, avg + sd 75.3 73.1 70.9 73.3 77.7 Figure 2. They show that a high heart rate Dynamic, all_act, avg + chg + sd 74.0 78.7 70.3 75.2 78.3 ( HR_avg_all_activities in the figures) and short QRS Dynamic, all_act, chg + sd 67.1 78.6 64.6 64.9 71.9 interval ( QRS_avg_all_activities, a feature of the ECG Dynamic, all_act, chg 62.1 71.2 55.5 64.8 64.4 signal) are associated with the feeling of good health, which Dynamic, all_act, sd 58.2 65.4 63.0 64.6 66.9 corresponds to the existing medical knowledge. Increased Dynamic, all_act + per_act, avg 77.0 72.5 71.6 75.7 78.4 Dynamic, all_act + per_act, avg + chg 73.4 71.8 71.0 76.7 79.1 weight ( DRWChg) is associated with bad health, which Dynamic + static, all_act, avg 77.5 79.2 75.5 76.4 79.3 makes sense, since it often signifies excess fluid retention, a Dynamic + static, all_act, avg + chg 77.8 80.4 77.0 79.6 80.5 common problem of CHF patients. Low humidity ( HumA) and decrease in humidity ( HumAChg) are associated with We also tested automatic feature selection methods from the good health, which matches the medical opinion that CHF Weka suite. None of the methods performed well on its patients often badly tolerate humid weather, although there own, so we used the features selected by at least two is little hard evidence for this. 
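Automatic selectors of the kind evaluated here (correlation-based subset selection, gain ratio, ReliefF, symmetrical uncertainty) can be run through WEKA's attribute-selection API, and the "selected by at least two methods" rule can be implemented as a simple vote, as in the sketch below. The file name and the top-20 cut-off for the ranking-based evaluators are assumptions, and the wrapper evaluator is omitted for brevity.

import java.util.HashMap;
import java.util.Map;

import weka.attributeSelection.ASEvaluation;
import weka.attributeSelection.ASSearch;
import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.BestFirst;
import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GainRatioAttributeEval;
import weka.attributeSelection.Ranker;
import weka.attributeSelection.ReliefFAttributeEval;
import weka.attributeSelection.SymmetricalUncertAttributeEval;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class FeatureVoting {
    private static Ranker topRanker(int n) {
        Ranker ranker = new Ranker();
        ranker.setNumToSelect(n);   // keep only the n best-ranked attributes (arbitrary cut-off)
        return ranker;
    }

    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("chf_all_features.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // Evaluator/search pairs: a subset evaluator with best-first search, and three
        // single-attribute evaluators combined with a ranker.
        ASEvaluation[] evaluators = { new CfsSubsetEval(), new GainRatioAttributeEval(),
                new ReliefFAttributeEval(), new SymmetricalUncertAttributeEval() };
        ASSearch[] searches = { new BestFirst(), topRanker(20), topRanker(20), topRanker(20) };

        // Count how many selectors choose each attribute and keep those chosen at least twice.
        Map<Integer, Integer> votes = new HashMap<>();
        for (int i = 0; i < evaluators.length; i++) {
            AttributeSelection selection = new AttributeSelection();
            selection.setEvaluator(evaluators[i]);
            selection.setSearch(searches[i]);
            selection.SelectAttributes(data);
            for (int attr : selection.selectedAttributes()) {
                if (attr != data.classIndex()) {
                    votes.merge(attr, 1, Integer::sum);
                }
            }
        }
        votes.forEach((attr, n) -> {
            if (n >= 2) {
                System.out.println(data.attribute(attr).name() + " selected by " + n + " methods");
            }
        });
    }
}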
Oxygen saturation ( DRS) methods out of the following: Correlation-based Feature below 97 % is associated with bad health in the second Subset, Gain Ratio, ReliefF, Symmetrical Uncertainty and model, which is normal, since the saturation in healthy Wrapper (the end result of the Wrapper approach was the individuals is 96 % – 100 %. Finally, the first model union of features selected when each of the five algorithms associates high systolic blood pressure ( SBP) and the second selected in Section 3.1 were used). As the starting point, we low diastolic blood pressure ( DBP) with good health. This is used all features, all dynamic features, and avg + chg subset expected in CHF patients, since their hearts have problems of all_act dynamic features. The results in Table 3 show that pumping out enough blood (low systolic blood pressure) as the first and third of these starting points resulted in the best well accepting enough blood (high diastolic blood pressure). models obtained so far, although we retained all the underlined features for the experiments in Section 3.3. Table 3: Comparison of automatic feature selection Algorithm m e , o es d iv M ip n a y k a rest V u 8 R 4 a o Features N B S P J J R F All features, FS 75.5 80.0 70.6 76.9 80.3 Dynamic, all_act, avg + chg, FS 77.4 74.5 71.9 76.3 79.3 Dynamic + static, all_act, avg + chg 75.3 73.1 70.9 73.3 77.7 Dynamic, all_act, avg + chg 74.0 78.7 70.3 75.2 78.3 Dynamic, all_act, avg 67.1 78.6 64.6 64.9 71.9 Figure 1: J48 classification tree on the avg subset of all_act Dynamic, FS 62.1 71.2 55.5 64.8 64.4 dynamic features 60 heart failure patients. Journal of the American College of Cardiology 54, 2009, pp. 1683–1694. [2] S. C. Inglis, R. A. Clark, F. A. McAlister, S. Stewart, J. G. Cleland. Which components of heart failure programmes are effective? A systematic review and meta-analysis of the outcomes of structured telephone support or telemonitoring as the primary component of chronic heart failure management in 8323 patients: abridged Cochrane Review. European Journal of Heart Failure 13, 2011, pp. 1028–1040. [3] C. Sousa, S. Leite, R. Lagido, L. Ferreira, J. Silva- Figure 2: J48 classification tree on the avg + chg subset of Cardoso, M. J. Maciel. Telemonitoring in heart failure: all_act dynamic features A state-of-the-art review. Revista Portuguesa de Cardiologia 33 (4), pp. 229–239. 4 CONCLUSION [4] Chiron project. http://www.chiron-project.eu/. [5] E. Mazomenos, J. M. Rodríguez, C. Cavero, G. Telemonitoring can provide huge quantities of medically Tartarisco, G. Pioggia, B. Cvetković, S. Kozina, H relevant data, which has the potential to revolutionize the Gjoreski, M. Luštrek, H. Solar, D. Marinčič, J. Lampe, care of patients with chronic diseases. However, before this S. Bonfiglio, K. Maharatna. Case Studies. In System can happen, the data must be properly interpreted, for which Design for Remote Healthcare, 2014, pp. 277–332. the current knowledge is not yet entirely adequate. This [6] M. Luštrek, B. Cvetković, M. Bordone, E. Soudah, C. paper presents the data gathered by telemonitoring of CHF Cavero, J. M. Rodríguez, A. Moreno, A. Brasaola, P. E. patients, and the first attempt to uncover interesting relations Puddu. Supporting clinical professionals in decision- in the data by data mining. A systematic procedure for the making for patients with chronic diseases. Proc. IS selection of appropriate data-mining algorithms, features 2013, pp. 126–129. and classes was designed, whose output were a number of [7] P. E. Puddu, J. M. 
Morgan, C. Torromeo, N. Curzen, models associating telemonitored parameters with the M. Schiariti, S. Bonfiglio. A clinical observational feeling of good or bad health. The models correspond quite study in the Chiron project: Rationale and expected well to the current medical knowledge, which demonstrates results. In Impact Analysis of Solutions for Chronic the validity of our approach. Disease Prevention and Management, 2012, pp. 74–82. In the future, we need to solve the technical difficulties [8] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. with extracting the ECG parameters and compute some new Reutemann, I. H. Witten. The WEKA data mining features that may be relevant (e.g., QT interval prolongation, software: An update. SIGKDD Explorations 11 (1), a feature of the ECG signal that is known to be associated 2009, pp. 10–18. with cardiovascular problems). Furthermore, the models resulting from data mining must be carefully examined by cardiologists, both the models presented in the paper and others. Those that contain hitherto unknown relations may be even more important than those that correspond to the current medical knowledge, since the relations in them may yield new and important insights. Finally, it would be desirable to study data that contain events such as hospital admissions or even deaths, since the findings on such data would be more reliable than on data that only contains self- reported feeling of health. However, another observational study would be needed for that, which is a difficult proposition that would require substantial funding. Acknowledgement This research described in this paper was carried out in the Chiron project, which was co-funded by the ARTEMIS Joint Undertaking (grant agreement # 2009-1-100228) and by national authorities. References [1] C. Klersy, A. De Silvestri, G. Gabutti, F. Regoli, A. Auricchio. A meta-analysis of remote monitoring of 61 APPROXIMATING DEX UTILITY FUNCTIONS WITH METHODS UTA AND ACUTA Matej Mihelč ić 1,3, Marko Bohanec2 1 Ruđer Bošković Institute, Division of Electronics, Laboratory for Information Systems, Croatia 2 Jožef Stefan Institute, Department of Knowledge Technologies, Jamova 39, Ljubljana, Slovenia 3 Jožef Stefan International Postgraduate School, Jamova 39, Ljubljana, Slovenia e-mail matej.mihelcic@irb.hr, marko.bohanec@ijs.si ABSTRACT numerical ones in some suitable way: first, the newly obtained numerical evaluations would facilitate an easy DEX is a qualitative multi-criteria decision analysis ranking and comparison of alternatives, especially those that (MCDA) method, aimed at supporting decision makers are assigned the same class by DEX; second, the sheer form in evaluating and choosing decision alternatives. We of numerical functions may tell us more about the properties present results of a preliminary study in which we of underlying DEX functions, which make them useful for experimentally assessed the performance of two well- verification, representation and justification of DEX models. known MCDA methods UTA and ACUTA to There have already been several attempts to approximate approximate qualitative DEX utility functions with DEX utility functions with numeric ones for various piecewise-linear marginal utility functions. This is seen purposes. A linear approximation method is commonly used as a way to improve the sensitivity of qualitative models in DEX to assess the importance (weights) of criteria [4, 5]. and provide a better insight in DEX utility functions. 
An early method for ranking of alternatives and improving The results indicate that the approach is in principle the sensitivity of evaluation has been proposed in [6] and is feasible, but at this stage suffers from problems of now referred to as QQ [7]. Recently, extensive research has convergence, insufficient sensitivity and inappropriate been carried out to approximate DEX functions with copulas handling of symmetric functions. [7, 8]. However, no known attempts have been made so far to approximate DEX functions with piecewise-linear 1 INTRODUCTION marginal utility functions, as provided by UTA. Multi criteria decision analysis (MCDA) [1] is an approach The aim of this study was to experimentally assess the concerned with structuring and solving decision problems performance of UTA and its variant, ACUTA [11], on a involving multiple criteria. MCDA provides a number of collection of typical DEX functions. The experiments were methods [2] to create a decision model from information carried out using two software tools: DEXi [4] to develop provided by the decision maker. This information can be DEX functions and Decision Deck [12] to run (AC)UTA. given in many ways, for instance by constructing evaluation functions directly, by providing parameters (such as criteria 2 METHODS AND TOOLS weights) to some predefined functions, by giving examples 2.1 DEX and DEXi of decisions, or by pairwise comparison of a subset of decision alternatives. Methods also differ in the DEX [3] is a qualitative MCDA method for the evaluation representation of this information (e.g., quantitative or and analysis of decision alternatives, and is implemented in qualitative) and their primary aim (choosing the best the software DEXi (http://kt.ijs.si/MarkoBohanec/dexi.html) alternative, ranking several alternatives, classifying [4]. In DEX, a decision model consists of hierarchically alternatives into predefined discrete classes, etc.). structured attributes: the hierarchy represents the Bridging the gap between different MCDA methods is decomposition of the decision problem into smaller sub sometimes highly desirable and may have a great practical problems, and attributes at higher levels of the hierarchy value. In this work, we try to combine two MCDA methods: depend on those on lower levels. Figure 1 (left) shows an DEX and UTA. DEX [3] is a qualitative method; it employs example of a tree of attributes for evaluating cars [4]. discrete attributes and discrete utility functions defined in a In the context of this paper, it is important to understand point-by-point way (see section 2.1). This makes DEX that all attributes in DEX models are qualitative and can suitable for classifying decision alternatives into discrete take values represented by words; for instance, the attribute classes. On the other hand, UTA [9, 10] is a quantitative PRICE in Figure 1 can take the values high, medium and method that constructs numerical additive utility functions low. Furthermore, the aggregation of attributes at some level from a provided subset of alternatives (see section 2.2). in the tree is defined by decision tables that consist of This work is motivated by the expectation that DEX’s elementary decision rules. 
For example, the table in Figure 1 functionality would have been substantially enhanced if we (right) defines the aggregation of two lower-level attributes were able to convert its discrete utility functions to PRICE and TECH.CHAR into the higher-level attribute 62 CAR: the values of CAR are specified for all combinations 3. YW: defined on the same space as YM, it represents an of values of PRICE and TECH.CHAR. Essentially, this asymmetric DEX function defined with weights [4]; the means that utility functions in DEX are discrete and defined weights assigned to the three arguments are 60%, 30% in a point-by-point way. This is illustrated in Figure 2, and 10%, respectively. which graphically represents the same function as in Figure All these functions are defined completely for all 1, so that each row of Figure 1 is represented by a dot in combinations of values of their arguments. Figure 2. The connecting lines are used only for 2.2 UTA and ACUTA visualization and are not part of function definition. The UTA method (UTilité Additive) [9,10] is used to assess utility functions which aggregate multiple criteria in a composite criterion used to rank the alternatives. Similarly as DEX, it uses a subjective ranking on a subset of the alternatives. On this basis, it creates piecewise-linear marginal utility functions. For a set of alternatives ܣ , ܽ ∈ ܣ , numerical criteria ݃ ൌ ሺ݃ଵ, ݃ଶ, … , ݃௡ሻ, and the utility function ܷሺ݃ሻ ൌ ܷሺ݃ଵ, ݃ଶ, … , ݃௡ሻ, the marginal utility functions ݑ௜ are approximated with: ௃ ௃ ݃௜ሺܽሻ െ ݃௜ ௃ାଵ ௃ ݑ௜ሾ݃௜ሺܽሻሿ ൌ ݑ௜൫݃௜ ൯ ൅ ሻ െ ݑ ሻሿ ݃௃ାଵ ௃ ሾݑ௜ሺ݃௜ ௜ሺ݃௜ ௜ െ ݃௜ Figure 1: A DEX model and a utility function example [4]. It is assumed that each attribute’s values are divided to CAR α ௃ ௃ାଵ ௜ െ 1 equally-sized intervals ሾ݃௜ , ݃௜ ሿ. The marginal utility functions ݑ௜ are constructed by solving the linear programming problem min F ൌ ∑ୟ∈୅ σሺaሻ exc under the constraints: ௡ ෍ ݑ good ௜ሾ݃௜ሺܽሻሿ െ ݑ௜ሾ݃௜ሺܾሻሿ ൅ ߪሺܽሻ െ ߪሺܾሻ ൒ ߜ, ܾܽܲ ௜ୀଵ ௡ acc ෍ ݑ௜ሾ݃௜ሺܽሻሿ െ ݑ௜ሾ݃௜ሺܾሻሿ ൅ ߪሺܽሻ െ ߪሺܾሻ ൌ 0, ܽܫܾ exc ௜ୀଵ unacc low ݑ ௃ାଵ ௃ ௜൫݃௜ ൯ െ ݑ௜൫݃௜ ൯ ൒ ݏ௜, ∀݅ ∈ ሼ1, … , ݊ሽ, ܬ ∈ ሼ1, … , ߙሽ good ௡ medium acc TECH.CHAR. PRICE ෍ ݑ ∗ ௜ሺ݃௜ ሻ ൌ 1 high b a d ௜ୀଵ ݑ ௃ ௜ሺ݃௜∗ሻ ൌ 0, ݑ௜൫݃௜ ൯ ൒ 0, ߪሺܽሻ ൒ 0, Figure 2: Graphical presentation of the CAR decision table. ∀݅ ∈ ሼ1, … , ݊ሽ, ܬ ∈ ሼ1, … , ߙሽ, ∀ܽ ∈ ܣ Here, ߪሺܽሻ denotes potential error relative to the starting Formally, a DEX utility function is defined over a set of utility ܷሾ݃ሺܽሻሿ. ݃∗ and ݃ criteria ݔ ௜ ௜∗ denote the high and low bounds ଵ, ݔଶ, … , ݔ௡, where all criteria are discrete and can of ݃ take values from the corresponding value scales ܦሺݔ ௜ respectively. ܲ and ܫ respectively denote strict ௜ሻ. A preference and indifference relations. utility function ܷ maps ݔ to the higher-level attribute ݕ: ܷ: ܦሺݔ In some cases there can be many utility functions that can ଵሻ ൈ ܦሺݔଶሻ ൈ ⋯ ൈ ܦሺݔ௡ሻ → ܦሺݕሻ ܷ represent the preferences specified. The utility functions are is represented by a decision table that consists of then assessed by means of post-optimality analysis [9]. elementary decision rules, where each rule defines the value The ACUTA method [11] offers an improvement upon of ܷ for some combination of argument values: 〈ݔ UTA. It proceeds by finding an analytic center of the ଵ, ݔଶ, … , ݔ௡〉 → ݕ additive value functions that are compatible with some user For experiments in this study, we used a number of DEX assessments of preferences. 
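For readability, the piecewise-linear interpolation of the marginal value functions and the linear programme that determines them can be restated in standard UTA notation, following [9]. Here g_i^* and g_{i*} denote the best and worst values of criterion g_i, each criterion scale is split into α_i − 1 equal intervals, δ is a small positive preference threshold, s_i a monotonicity threshold, and the index ranges follow the usual convention:

\[ u_i[g_i(a)] = u_i\!\left(g_i^{J}\right) + \frac{g_i(a) - g_i^{J}}{g_i^{J+1} - g_i^{J}} \left[ u_i\!\left(g_i^{J+1}\right) - u_i\!\left(g_i^{J}\right) \right], \qquad g_i(a) \in \left[ g_i^{J}, g_i^{J+1} \right] \]

\[ \min \; F = \sum_{a \in A} \sigma(a) \quad \text{subject to} \]

\[ \sum_{i=1}^{n} \left( u_i[g_i(a)] - u_i[g_i(b)] \right) + \sigma(a) - \sigma(b) \geq \delta \quad \text{if } a \, P \, b, \]

\[ \sum_{i=1}^{n} \left( u_i[g_i(a)] - u_i[g_i(b)] \right) + \sigma(a) - \sigma(b) = 0 \quad \text{if } a \, I \, b, \]

\[ u_i\!\left(g_i^{J+1}\right) - u_i\!\left(g_i^{J}\right) \geq s_i, \qquad \sum_{i=1}^{n} u_i\!\left(g_i^{*}\right) = 1, \qquad u_i\!\left(g_{i*}\right) = 0, \]

\[ u_i\!\left(g_i^{J}\right) \geq 0, \qquad \sigma(a) \geq 0, \qquad \forall i \in \{1, \dots, n\},\ J \in \{1, \dots, \alpha_i - 1\},\ a \in A. \]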
In this way, ACUTA solves the utility functions, but in this paper we will present only three: model selection problem present in the UTA method when 1. CAR function, as defined in Figures 1 and 2; there are multiple valid solutions. Similarly as UTA, it 2. YM: defined over three attributes ( ݊ ൌ 3ሻ , all the constructs marginal utility functions by solving a attributes have five values. The function is symmetric constrained optimization problem, see [11] for details. and represents a very common DEX function, which In order to approximate DEX utility functions with behaves as min ሺݔଵ, ݔଶ, ݔଷሻ when any of the arguments (AC)UTA, we mapped qualitative DEX attributes ݔ ∈ ܦሺݔሻ takes the lowest possible value, and as a qualitative to equidistant numerical scales ݃ ൌ ሾ1, |ܦሺݔሻ|ሿ. average of ݔଵ, ݔଶ and ݔଷ otherwise. 63 2.3 Decision Deck and Diviz would allow the method to converge. We used the inverse The Decision Deck (http://www.decision-deck.org/project/) DEX attribute label score as a priori rank for the UTA is a project aimed at developing an open-source MCDA method. As a result, all the alternatives with the same DEXi software platform [12]. Diviz is a software component label score were indifferent for UTA. This required us to developed in Decision Deck aimed at designing, executing take only a small, targeted subset of available a priori ranks. and sharing MCDA methods, algorithms and experiments Overall, the results produced by UTA were poor and did [12]. Diviz enables combining programs that implement not accurately approximate input functions. MCDA algorithms in a modular way and connecting them in The ACUTA method performed much better and the terms of workflows. models were built on the whole domain of the DEX utility functions. However, we did experience convergence issues when using inverse DEX label attribute score as a priori rank for all the alternatives, so we had to take a subset that allowed the method to converge. The convergence error message reported by ACUTA was as follows: Error - failed to converge, due to bad information. Please check your data, rescale the problem, or try with less constraints. Figure 4: ACUTA results for DEX function Car. Figure 3: The ACUTA decision support workflow. Figure 3 shows the workflow used in this study to run ACUTA. The input consists of six datasets. The Criteria file contains names and ID's of the decision criteria, the Alternatives file contains names and ID's of the alternatives, the PerformanceTable file contains the attribute values for each alternative, the AlternativeValues file contains a ranking of a small sample of the alternatives determined by the decision maker (usually called a priori ranking), the PreferenceDirection file indicates preferred optimization direction, and the NumberofSegments file defines the number of segments to which the attribute values are split. The output of the workflow is a rank of alternatives given their attribute values and the a priori ranking. Figure 5: ACUTA results for DEX function YM. 3 RESULTS The results for the Car utility function are shown in Figure 4, where g1 and g2 indicate DEX attributes PRICE Several problems were detected when we attempted to and TECH.CHAR. Both marginal utility functions properly approximate DEXi utility functions with (AC)UTA in Diviz. 
increase, and in g1 the relations between utility values in First, the standard UTA method could not handle DEX points 1, and 2 appear right, however utility value in point 3 utility functions and returned an error message: is too high. We noticed similar behavior in function g2. Execution terminated, but no result were produced: you probably hit a bug in the service. […] Figure 5 shows results for the DEX function YM. In our In order to get any results, we had to take only a subset of opinion, marginal utility functions approximate YM quite the rules, that is, remove a subset of entries from the UTA well, however they indicate a common problem encountered performance table. in the experiments: YM is symmetric, therefore ACUTA’s The second problem with UTA was setting the a priori marginal functions should be equal to each other, but they alternative ranking (i.e., the target attribute) in a way that 64 are not. In this way, the resulting representation does not In future work, we wish to theoretically and empirically properly capture the symmetricity of the original function. address these issues and alleviate these problems, either by Marginal utility functions in Figure 6 correctly indicate adopting some other method from the rich set of UTA- that YW is asymmetric and, observing function’s maximum related methods [10], by adapting (AC)UTA to specific values, that the attributes g1, g2, and g3 are less and less properties of DEX functions, or by developing entirely new important. However, some sections of these functions are methods. Eventually, the method should be able to deal with almost constant, which does not hold in the original all type of DEX functions, including large ones, function. incompletely defined ones and those defined with distributions of classes. References [1] Ehrgott M., Figueira J.R., Greco S.: Trends in Multiple Criteria Decision Analysis, International Series in Operations Research & Management Science, Vol. 142, New York: Springer, 2010. [2] Figueira J.R., Greco S., Ehrgott M.: Multiple Criteria Decision Analysis: State of the Art Surveys, Boston: Springer, 2005. [3] Bohanec M., Rajkovič V., Bratko I., Zupan B., Žnidaršič M.: DEX methodology: Three decades of qualitative multi-attribute modelling. Informatica 37, Figure 6: ACUTA results for DEX function YW. 49–54, 2013. [4] Bohanec M.: DEXi: Program for Multi-Attribute Decision Making, User's Manual, Version 4.00 4 CONCLUSION . IJS Report DP-11340, Ljubljana: Jožef Stefan Institute, In this preliminary study we tried to approximate several 2013. DEX utility functions by using the basic UTA method and [5] Bohanec M., Zupan B.: A function-decomposition its derivative, ACUTA. In general, the approach turned out method for development of hierarchical multi-attribute to be feasible, producing marginal utility functions from decision models. Decision Support Systems 36, 215– DEX utility functions, which are defined by points in a 233, 2004. discrete multidimensional space. The obtained functions are [6] Bohanec M., Urh B., Rajkovič V.: Evaluating options easy to interpret and do provide useful information about by combined qualitative and quantitative methods. Acta DEX attributes and scales (e.g., numeric utility value for Psychologica 80, 67–89, 1992. each discrete attribute value), and the underlying DEX [7] Mileva-Boshkoska B., Bohanec M.: A method for utility functions (e.g., about relative importance of ranking non-linear qualitative decision preferences attributes). 
Therefore, the approach is useful for representing using copulas. International Journal of Decision and understanding DEX utility functions: the representation Support System Technology 4(2), 42–58, 2012. consists of a set of additive utility functions that represent [8] Mileva-Boshkoska B., Bohanec M., Boškoski P., Juričić attribute trends and importance’s that cannot be easily Đ.: Copula-based decision support system for quality observed by examining DEX utility functions themselves. ranking in the manufacturing of electronically On the other hand, we encountered several problems with commutated motors. Journal of Intelligent the methods and their implementation. UTA rarely gives any Manufacturing, doi: 10.1007/s10845-013-0781-7, 2013. results on the original DEX functions, and even after [9] Jacquet-Lagreze E., Siskos J.: Assessing a set of tweaking the inputs the results were unsatisfactory. ACUTA additive utility functions for multicriteria decision- performs much better, it can work on the whole domain of making, the UTA method, European Journal of the DEX function, but the a priori rank subset needs to be Operational Research 10(2), 151–164, 1982. carefully chosen in order to avoid convergence problems. [10] Siskos Y., Grigoroudis E., Matsatsinis N.F.: UTA The theoretical reasons for convergence problems of these methods. In: Multiple Criteria Decision Analysis: State methods are still to be determined. of the Art Surveys, 297–343, Boston: Springer, 2005. Marginal utility functions, generated by ACUTA, in [11] Bous G., Fortemps P., Glineur F., Pirlot M.: ACUTA: A principle appropriately represent the marginal behavior of novel method for eliciting additive value functions on DEX attributes, but they exhibit two common problems: the basis of holistic preference statements, European • insufficient sensitivity to changes of attribute values Journal of Operational Research 206(2), 435–444, (some sections of ACUTA functions are (almost) 2010. constant even though the underlying DEX function is [12] Ros J.C.: Introduction to Decision Deck–Diviz: not); Examples and User Guide, Technical report DEIM-RT- • inappropriately representing symmetric DEX functions 11-001, Tarragona: Universitat Rovira i Virgili, 2011. with mutually different marginal utility functions. 65 COMPARING RANDOM FOREST AND GAUSSIAN PROCESS MODELING IN THE GP-DEMO ALGORITHM Miha Mlakar, Tea Tušar, Bogdan Filipič Department of Intelligent Systems, Jožef Stefan Institute and Jožef Stefan International Postgraduate School, Jamova cesta 39, SI-1000 Ljubljana, Slovenia e-mail: {miha.mlakar, tea.tusar, bogdan.filipic}@ijs.si ABSTRACT is accurate, GP-DEMO finds high-quality results with a low number of exact solution evaluations, while if it is not, GP- In surrogate-model-based optimization, the selection of DEMO needs more exact solution evaluations to achieve sim- an appropriate surrogate model is very important. If so- ilar results. lution approximations returned by a surrogate model are Since the accuracy of the surrogate model in surrogate- accurate and with narrow confidence intervals, an algo- model-based optimization is crucial, we decided to apply two rithm using this surrogate model needs less exact solu- different modeling techniques and compare their approxima- tion evaluations to obtain results comparable to an algo- tions to determine which one is more suitable for use in a rithm without surrogate models. In this paper we com- surrogate-model-based algorithm. 
In addition to Gaussian pare two well known modeling techniques, random forest process (GP) modeling that is used in GP-DEMO, we used (RF) and Gaussian process (GP) modeling. The compar- random forest (RF) for comparison. The reason for choos- ison includes the approximation accuracy and confidence ing RF was the fact that the methodology is well-known and in the approximations (expressed as the confidence inter- that the solutions approximated with this method in addition val width). The results show that GP outperforms RF and to approximated values return also confidence intervals. that it is more suitable for use in a surrogate-model-based The structure of this paper is as follows. In Section 2, we multiobjective evolutionary algorithm. present how the comparison of RF and GP modeling tech- niques was carried out. In Section 3, we discus the results 1 INTRODUCTION gained with both techniques, compare them and determine which technique performs better. Section 4 concludes the pa- One of the most effective ways to solve problems with multi- per with an overview of the work done. ple objectives is to use multiobjective evolutionary algorithms (MOEAs). The MOEAs draw inspiration from optimization 2 COMPARISON OF RF AND GP SURROGATE processes occuring in nature and perform many solution eval- MODELS uations to find high-quality solutions. Due to the high number of solution evaluations the MOEAs are not very suitable for In this section we compare random forest and Gaussian pro- computationally expensive optimization problems where ex- cess modeling techniques used for solution approximations. act solution evaluation takes a lot of time. In order to obtain The aim of the comparison is to determine which of the two the results of such problem more quickly, we usually use sur- techniques is more suitable for use in surrogate-model-based rogate models to approximate the objective functions of the optimization. problem. To test the two techniques, we used relations under uncer- But due to inaccurate approximations, the solution com- tainty to compare their approximated solutions. If two so- parisons can be incorrect, which can result in very good so- lution approximations had overlapping confidence intervals, lutions being discarded. In order to minimize the impact of we, in order to determine their relation, exactly evaluated one incorrect comparisons, we defined the relations under uncer- solution and compared the solutions again. Together with the tainty ([5]) for comparing approximated solutions presented number of these additional exact evaluations, we measured with an approximated value and a confidence interval. By also the number of incorrect solution comparisons and the including the confidence interval in the comparison we were width of the confidence intervals. able to consider this additional information and minimize the In addition to using relations under uncertainty, we also number of incorrect comparisons. compared the approximated solutions with Pareto dominance We used these relations under uncertainty in the algo- relations and measured the number of incorrect comparisons. rithm called Differential Evolution for Multiobjective Opti- With Pareto dominance relations the confidence intervals are mization based on Gaussian Process modeling (GP-DEMO) not included in the comparisons, so in general, the number of [4]. We discovered that the quality of the gained result de- incorrect comparisons hints at the accuracy of the approxima- pends greatly on the surrogate model. 
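Neither the GP implementation from GP-DEMO nor the exact way the random-forest confidence interval is obtained is reproduced here, so the following WEKA-based Java sketch is only one way to obtain, for a single candidate solution, an approximated objective value together with a confidence interval from each of the two techniques. The file name, the use of an ensemble of randomised regression trees as the random-forest stand-in, and the ±2-standard-deviation interval are all assumptions.

import java.util.Random;

import weka.classifiers.functions.GaussianProcesses;
import weka.classifiers.trees.RandomTree;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SurrogateIntervals {
    public static void main(String[] args) throws Exception {
        // Hypothetical file of exactly evaluated solutions for one objective of a test problem.
        Instances train = new DataSource("osy_objective1.arff").getDataSet();
        train.setClassIndex(train.numAttributes() - 1);
        Instance query = train.lastInstance();   // stand-in for a new candidate solution

        // Gaussian-process surrogate: mean prediction plus a predictive standard deviation.
        GaussianProcesses gp = new GaussianProcesses();
        gp.buildClassifier(train);
        double gpMean = gp.classifyInstance(query);
        double gpStd = gp.getStandardDeviation(query);
        System.out.printf("GP : %.3f +- %.3f%n", gpMean, 2 * gpStd);

        // Random-forest-style surrogate: randomised regression trees built on bootstrap
        // resamples; the spread of the per-tree predictions plays the role of the interval.
        int trees = 100;
        double sum = 0, sumSq = 0;
        Random rnd = new Random(1);
        for (int t = 0; t < trees; t++) {
            RandomTree tree = new RandomTree();
            tree.setSeed(rnd.nextInt());
            tree.buildClassifier(train.resample(rnd));   // bootstrap sample of the training set
            double prediction = tree.classifyInstance(query);
            sum += prediction;
            sumSq += prediction * prediction;
        }
        double mean = sum / trees;
        double std = Math.sqrt(Math.max(0, sumSq / trees - mean * mean));
        System.out.printf("RF : %.3f +- %.3f%n", mean, 2 * std);
    }
}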
The solutions selected for testing were not generated randomly, but rather produced by the well-known NSGA-II algorithm [3]. This ensured that the solution comparisons were similar to the comparisons performed in evolutionary multiobjective algorithms and thus provided relevant results. In every generation NSGA-II creates a new set of solutions, adds them to the current ones and then performs selection on the union to identify the most promising ones. The selection procedure includes comparing every solution with all other solutions to determine its dominance status. These were the comparisons used in our study.
The experiments were performed on three benchmark multiobjective optimization problems. One is the Poloni optimization problem [6] and two are from [2], called OSY and SRN. All of them are two-objective problems.
For testing purposes we used GP modeling as proposed by [7] and RF modeling as proposed in [1]. For the confidence interval width of the approximation we used two standard deviations (2σ), which corresponds to about 95% of the normal distribution of the approximations. The number of trees used for building RF was 10,000 and the minimum number of elements in the leaves was set to 1.
To test the correlation between the surrogate model accuracy and the incorrect comparisons, different models of increasing accuracy were built, each on a larger number of solutions. Since building an RF surrogate model is faster than building a GP surrogate model, we, in addition to building surrogate models from 20, 30, 50, 100 and 200 exactly evaluated solutions, also built an RF surrogate model from 1000 exactly evaluated solutions. We tested how much the larger RF surrogate model built from 1000 exactly evaluated solutions increases the accuracy of the approximations.
The NSGA-II parameter values used in the experiments were the same for both modeling techniques and for all three problems. They were set as follows:
• population size: 100,
• number of generations: 100,
• number of runs: 30.
The results averaged over 30 runs are presented in Tables 1–3 (for GP modeling) and in Tables 4–6 (for RF modeling). In all settings the total number of comparisons was 3,940,200.

Table 1: Comparison of the relations under uncertainty and Pareto dominance relations for GP modeling on the Poloni problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 1,515 | 3,635,805 | 92 | 26.25
Relations under uncertainty | 30 | 682 | 3,152,124 | 80 | 15.41
Relations under uncertainty | 50 | 138 | 1,218,337 | 31 | 1.29
Relations under uncertainty | 100 | 65 | 672,384 | 17 | 0.012
Relations under uncertainty | 200 | 13 | 549,380 | 14 | 0.002
Pareto dominance relations | 20 | 367,684 | / | / | 26.25
Pareto dominance relations | 30 | 159,945 | / | / | 15.41
Pareto dominance relations | 50 | 22,032 | / | / | 1.29
Pareto dominance relations | 100 | 2,309 | / | / | 0.012
Pareto dominance relations | 200 | 1,219 | / | / | 0.002

Table 2: Comparison of the relations under uncertainty and Pareto dominance relations for GP modeling on the OSY problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 74,181 | 2,289,682 | 58 | 42.81
Relations under uncertainty | 30 | 21,861 | 1,934,212 | 49 | 25.98
Relations under uncertainty | 50 | 19,342 | 1,426,775 | 36 | 25.05
Relations under uncertainty | 100 | 144 | 712,298 | 18 | 0.07
Relations under uncertainty | 200 | 152 | 271,821 | 7 | 0.03
Pareto dominance relations | 20 | 336,049 | / | / | 42.81
Pareto dominance relations | 30 | 136,357 | / | / | 25.98
Pareto dominance relations | 50 | 49,790 | / | / | 25.05
Pareto dominance relations | 100 | 1,736 | / | / | 0.07
Pareto dominance relations | 200 | 1,453 | / | / | 0.03

Table 3: Comparison of the relations under uncertainty and Pareto dominance relations for GP modeling on the SRN problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 7,407 | 2,703,783 | 69 | 50.03
Relations under uncertainty | 30 | 16 | 2,338,535 | 59 | 0.074
Relations under uncertainty | 50 | 2 | 749,258 | 19 | 0.099
Relations under uncertainty | 100 | 3 | 359,952 | 9 | 0.022
Relations under uncertainty | 200 | 11 | 183,625 | 5 | 0.009
Pareto dominance relations | 20 | 188,401 | / | / | 50.03
Pareto dominance relations | 30 | 161 | / | / | 0.074
Pareto dominance relations | 50 | 543 | / | / | 0.099
Pareto dominance relations | 100 | 645 | / | / | 0.022
Pareto dominance relations | 200 | 648 | / | / | 0.009
Table 4: Comparison of the relations under uncertainty and Pareto dominance relations for RF modeling on the Poloni problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 22,497 | 3,906,474 | 99 | 32.89
Relations under uncertainty | 30 | 5,206 | 3,937,230 | 99 | 31.83
Relations under uncertainty | 50 | 2,180 | 3,935,723 | 99 | 28.42
Relations under uncertainty | 100 | 125 | 3,930,277 | 99 | 23.97
Relations under uncertainty | 200 | 4 | 3,909,386 | 99 | 19.76
Relations under uncertainty | 1,000 | 2 | 3,619,402 | 92 | 12.11
Pareto dominance relations | 20 | 1,021,750 | / | / | 32.89
Pareto dominance relations | 30 | 965,491 | / | / | 31.83
Pareto dominance relations | 50 | 1,043,216 | / | / | 28.42
Pareto dominance relations | 100 | 894,889 | / | / | 23.97
Pareto dominance relations | 200 | 733,044 | / | / | 19.76
Pareto dominance relations | 1,000 | 379,928 | / | / | 12.11

Table 5: Comparison of the relations under uncertainty and Pareto dominance relations for RF modeling on the OSY problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 1 | 2,663,597 | 68 | 842.41
Relations under uncertainty | 30 | 0 | 2,663,597 | 68 | 789.04
Relations under uncertainty | 50 | 0 | 2,663,597 | 68 | 767.92
Relations under uncertainty | 100 | 0 | 2,663,597 | 68 | 720.44
Relations under uncertainty | 200 | 0 | 2,663,597 | 68 | 677.79
Relations under uncertainty | 1,000 | 0 | 2,663,597 | 68 | 548.19
Pareto dominance relations | 20 | 885,416 | / | / | 842.41
Pareto dominance relations | 30 | 770,439 | / | / | 789.04
Pareto dominance relations | 50 | 810,251 | / | / | 767.92
Pareto dominance relations | 100 | 683,578 | / | / | 720.44
Pareto dominance relations | 200 | 661,919 | / | / | 677.79
Pareto dominance relations | 1,000 | 555,983 | / | / | 548.19

Table 6: Comparison of the relations under uncertainty and Pareto dominance relations for RF modeling on the SRN problem
Relation type | Solutions used for surrogate model | Incorrect comparisons | Comparisons with confidence interval reductions | Proportion of confidence interval reductions [%] | Confidence interval width
Relations under uncertainty | 20 | 18 | 3,384,351 | 86 | 359.51
Relations under uncertainty | 30 | 0 | 3,385,285 | 86 | 350.55
Relations under uncertainty | 50 | 0 | 3,385,242 | 86 | 308.94
Relations under uncertainty | 100 | 0 | 3,384,910 | 86 | 266.55
Relations under uncertainty | 200 | 0 | 3,378,456 | 86 | 224.77
Relations under uncertainty | 1,000 | 0 | 3,133,626 | 79 | 139.89
Pareto dominance relations | 20 | 387,854 | / | / | 359.51
Pareto dominance relations | 30 | 425,691 | / | / | 350.55
Pareto dominance relations | 50 | 365,606 | / | / | 308.94
Pareto dominance relations | 100 | 288,611 | / | / | 266.55
Pareto dominance relations | 200 | 216,634 | / | / | 224.77
Pareto dominance relations | 1,000 | 136,656 | / | / | 139.89

3 DISCUSSION

The results gained with both modeling techniques show that, irrespective of the accuracy of a surrogate model, using relations under uncertainty reduces the number of incorrect comparisons.
The comparison of the results gained with RF and GP reveals certain differences between the techniques. The main difference is in the width of the confidence intervals. RF surrogate models produce wider confidence intervals. Consequently, the number of comparisons with confidence interval reductions for RF is much higher than for GP.
In addition to yielding wider confidence intervals, the RF surrogate models are also less accurate. Comparing the number of incorrect comparisons performed with Pareto dominance relations, where the confidence intervals are not considered, we can see that the number of incorrect comparisons is higher with the RF surrogate models.
Another difference is in the correlation between the number of solutions used for building the surrogate model and the accuracy of the surrogate model. By increasing the number of solutions used, the RF surrogate models do not improve as quickly as the GP models. Even in the cases where 1000 exactly evaluated solutions were used for building the RF surrogate models, the confidence interval widths were not greatly reduced and the intervals were still much wider than the confidence intervals gained with GP models built from 200 solutions.
Looking at the number of incorrect comparisons, we can see that by using relations under uncertainty with RF the results are slightly better than with GP. The reason is that the approximations with RF have relatively wide confidence intervals, which results in a high number of confidence interval reductions. Therefore, most solutions have to be exactly evaluated in order to perform the comparisons. So the reason for a lower number of incorrect comparisons is not the higher quality of the surrogate models, but the fact that more solutions are exactly evaluated and are therefore without uncertainty. Since in surrogate-model-based optimization exactly evaluated solutions are typically computationally expensive, a modeling technique that exactly evaluates most of the solutions is not very useful.
4 CONCLUSION

In this paper we compared random forest and Gaussian process modeling techniques in the context of surrogate-model-based multiobjective optimization. We compared their approximation accuracy and the width of the confidence intervals. The results show that surrogate models built with GP modeling produce more accurate approximations with narrower confidence intervals. Due to the narrower confidence intervals, the comparisons of solutions approximated with GP modeling require fewer additional exact solution evaluations. As a result, we can conclude that GP modeling is more appropriate for use in a surrogate-model-based algorithm than RF.

References

[1] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[2] K. Deb. Multi-Objective Optimization Using Evolutionary Algorithms. Wiley, New York, 2001.
[3] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, 2002.
[4] M. Mlakar, D. Petelin, T. Tušar, and B. Filipič. GP-DEMO: Differential evolution for multiobjective optimization based on Gaussian process models. European Journal of Operational Research, 2014, doi: 10.1016/j.ejor.2014.04.011.
[5] M. Mlakar, T. Tušar, and B. Filipič. Comparing solutions under uncertainty in multiobjective optimization. Mathematical Problems in Engineering, 2014, doi: 10.1155/2014/817964.
[6] C. Poloni, A. Giurgevich, L. Onesti, and V. Pediroda. Hybridization of a multi-objective genetic algorithm, a neural network and a classical optimizer for a complex design problem in fluid dynamics. Computer Methods in Applied Mechanics and Engineering, 186(2):403–420, 2000.
[7] C. E. Rasmussen and C. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006.
69 COMPREHENSIBILITY OF CLASSIFICATION TREES – SURVEY DESIGN Rok Piltaver1,2, Mitja Luštrek2, Matjaž Gams1,2, Sanda Martinč ić – Ipšić 3 Jožef Stefan Institute - Department of Intelligent Systems, Ljubljana, Slovenia 1 Jožef Stefan International Postgraduate School, Ljubljana, Slovenia 2 University of Rijeka - Department of Informatics, Rijeka, Croatia 3 rok.piltaver@ijs.si, mitja.lustrek@ijs.si, matjaz.gams@ijs.si, smarti@inf.uniri.hr ABSTRACT means for inducing definition of comprehensibility metrics that capture fine-grained differences in classifier Comprehensibility is the decisive factor for application comprehensibility and for evaluating the induced metrics. of classifiers in practice. However, most algorithms that User survey based approach, which follows the observation learn comprehensible classifiers use classification model that comprehensibility is in the eye of the beholder [16], is size as a metric that guides the search in the space of all advocated; defining comprehensibility metric directly is not possible classifiers instead of comprehensibility - which possible because it is comprehensibility is ill-defined [13]. is ill-defined. Several surveys have shown that such simple complexity metrics do not correspond well to the 2 REVIEW OF RELATED WORK comprehensibility of classification trees. This paper therefore suggests a classification tree comprehensibility According to [16] comprehensibility measures the "mental survey in order to derive an exhaustive fit" [15] of the classification model, which has two main comprehensibility metrics better reflecting the human drivers: the type of classification model and its size or sense of classifier comprehensibility and obtain new complexity. It is generally accepted that tree and rule based insights about comprehensibility of classification trees. models are the most comprehensible while SVM, ANN and ensembles are in general black box models that can be 1 INTORDUCTION hardly interpreted by users [8, 16, 20]; however there are domain and user specific exceptions from this rule of thumb. Comprehensibility of data mining models, also termed For a given classification model, the comprehensibility interpretability [15] or understandability [1], is the ability to generally decreases with the size [2]. This principle is understand the output of induction algorithm [14]. Its motivated by Occam's razor, which prefers simpler models importance has been stressed since the early days of over more complex ones [6]. Furthermore, a rule based machine learning research [17, 19]. Kodratoff even reports model with few long clauses is harder to understand than that it is the decisive factor when machine learning one with shorter clauses, even if the models are of the same approaches are applied in industry [13]. Application absolute size [20]. Comprehensibility also decreases with domains in which comprehensibility is emphasized are for increasing number of variables and constants in a rule [20] instance medicine, credit scoring, churn prediction, and amount of inconsistency with existing domain bioinformatics, and others [8]. knowledge [1, 18]. A metric of comprehensibility is therefore needed in User-oriented assessment of classifier comprehensibility order to compare learning systems performance and as a [1] compared outputs of several tree and rule learning (part of) heuristic function used by a learning algorithm [9, algorithms and concluded that trees are more 21]. 
Majority of algorithms for learning comprehensible comprehensible then rules, and that in some cases tree size models use simple measures based on model size which may is negatively correlated with comprehensibility. Note that oversimplify the learned models. Humans by nature are the trees included in the study were simple and were mentally opposed to too simplistic representations of probably perceived as less comprehensible because they did complex relations [7], therefore it is no surprise that not agree with the users’ knowledge. Another study [12] empirical studies have shown comprehensibility to be (based on inexperienced users) compared comprehensibility negatively correlated with the complexity (size) of a of decision tables, trees and rules. The results showed that classifier in at least some cases [1]. Such simple measures the respondents were able to answer the questions faster, based on model complexity are therefore regarded as an more accurately and more confidently using decision tables over-simplistic notion of comprehensibility [8]. than using rules or trees and were clearly able to assess the Those facts motivated us to propose a survey design, with difficulty of the questions. Larger classifiers resulted in a the goal to derive an exhaustive comprehensibility metrics decrease in answer accuracy, an increase in answer time, better reflecting the human sense of classifier and a decrease in confidence in answers. Evidence that comprehensibility. Obtained insights into evaluator’s answering logical questions (e.g. validate a classifier) is judgments about classifier comprehensibility will provide 70 considerably more difficult than classifying a new instance validate the classifier, and discover new knowledge. Thus was found. However, proposition that cognitive fit of the second task - explain ask the respondent to answer classifier with the given task type influences users’ which attributes values must be changed or retained in order performance received limited support. A paper on to classify a given instance into another class. For example, comprehensibility of classification trees, rules, and tables, which habits (values of attributes) would a patient with high nearest neighbor and Bayesian network classifiers [8] probability of getting cancer (class) have to change in order stressed that graphical representation, hierarchical structure, to stay healthy? The third task - validate requires the including only subset of attributes in a tree, and respondent to check whether a statement about the domain is independence of tree branches are advantages of confirmed or rejected according to the presented classifier. classification trees. On the other hand, possible irrelevant For example: does the tree say that persons smoking more attributes and replicated subtrees enforced by the tree than 15 cigarettes per day are likely to get cancer. Similar structure decrease comprehensibility and may lead to questions were also asked in [12]. The fourth task - discover overfitting. This can be mitigated by converting a tree into a asks the respondent to find a property (attribute-value pair) rule set, which enables more flexible pruning resulting in a that is unusual for instances from one class; this corresponds more comprehensible representation. Another recognized to finding a property of outliers. 
For example, people that downside of classification trees is their Boolean logic-based lead healthy life are not likely to get cancer, except if they nature as opposed to the probabilistic interpretation of naïve have already suffered from it in the past. Bayes, which might be preferred in some applications [8]. The fifth task - rate requests the user to give the This paper focuses on the comprehensibility of subjective opinion about the classification trees on a scale classification trees; however most of the suggested ideas with five levels: very easy to comprehend, easy to could be analogically implemented on classification rules comprehend, comprehensible, difficult to comprehend, and and tables as well. The survey design enables analysis of the very difficult to comprehend. Each label of the scale is influence of tree complexity and visualization on its accompanied with an explanation that relates to the time comprehensibility. The complexity of classification tree is needed to comprehend the tree and difficulty of usually measured with the number of leaves or nodes in a remembering it and explaining it to another person. The tree or the number of nodes per branch [16, 20] while the purpose of explanations is to prevent variation in subjective suggested survey considers some additional complexity interpretations of the scale. The task intentionally follows measures as well. The influence of visualization on the first four tasks in which the respondents use the comprehensibility has been stressed [16] but empirical classifiers and obtain hands on experience, which enables studies are missing, therefore the suggested survey also them to rank the comprehensibility. The classifiers are considers visualization factors. The past empirical studies of learned on a single dataset and visualized using Orange tool classifier comprehensibility [1, 12] were performed only on [5] in order to be consistent across all the tasks and enable homogenous groups of students, therefore we suggest reliable and prompt responding. For the same reason adding data mining experts with different cultural meaningful attribute and class names are used. The first five background to the group of participants in future studies. tasks measure the influence of classifier complexity (i.e. the number of leaves, depth, branching) while the final task 3 SURVEY DESIGN measures the influence of different representations of the same tree on the comprehensibility. One possible way to estimate comprehensibility of a Task six - compare asks the respondents to rate which of classifier is to present it to a survey respondent, who will the two classification trees shown side by side is more analyze it, and then conduct an interview about comprehensible on the scale with three levels: the tree is comprehensibility. This approach is very time consuming much more comprehensible, the tree is more and may be unintentionally biased by both involved persons, comprehensible, and the trees are equally comprehensible. e.g. asking a question about comprehensibility of a model One of the trees in this task is already used in the previous may help the respondent in comprehending the classifier. five tasks - serving as a known frame of reference - while Therefore the indirect and more objective approach that was the other one is a previously unseen tree with the same also used in previous studies [1, 12] is preferred. It measures content but represented in different style. 
The position of a the performance of respondents asked to solve tasks that tree (left or right) is randomized in order to prevent bias, e.g. involve interpretation and understanding of classifiers. The assuming that the left tree is always more comprehensible. following subsections of the paper define the selected survey tasks, performance metrics, observed properties of 3.2 Performance metrics classifiers, and strategies that prevent bias. The tasks rate and compare are directed toward obtaining 3.1 Survey tasks (question types) subjective opinions rated on the given scales. The tasks classify, explain, validate, and discover are directed toward The comprehensibility survey consists of six tasks. The first objectively quantifying respondents’ performance (e.g. time task - classify asks respondent to classify an instance and correctness of answers). Corresponding performance according to a given classifier (same as in [1, 12]). Tasks 2- metrics are derived from the six metrics proposed in the 4 are based on [4], which reports that comprehensibility is experiments on conceptual model understandability [11]. required to explain individual instance classifications, The first three are explicitly measured by the survey: the 71 time needed to understand a model translates to time to using well-known machine learning algorithm rather than answer a question (longer time - less comprehensible manually constructed. Using different pruning parameters classifier); correctly answering questions about the content produces trees with different sizes. Higher branching factor translates to the probability of correct answer (higher can be achieved by replacing original binary attributes with probability - more comprehensible classifier); the perceived constructed attributes, which can be interpreted as building ease of understanding is expressed with subjective judgment deep models [20]. If possible, order of the leaves or at least of a questions difficulty (rated on scale very easy, easy, their grouping in subtrees should remain the same as in the medium, difficult and very difficult). The other measures are binary tree. Choosing a question for a given tree determines implicitly embedded in the survey design: difficulty of the number of nodes in a branch that the user will have to recalling a model is captured through descriptions of the five analyze in order to answer. In each group of questions a levels of comprehensibility scale in the rate task; problem- single parameter changes while the others remain constant. solving based on the model content is embedded in tasks 1- Finally, a well-known and comprehensible classifier 4; and verification of model content is in the validate task. visualization style must be used, e.g. Orange [5]. Order of the question may also induce bias. For 3.3 Observed classifier properties example, the learning effect can occur: the respondents need Motivated by the related work [1, 8, 12, 20] and authors’ more time to answer the first few questions, after that they experience the following tree complexity properties are answer quicker. Next, the performance of respondents drops proposed: number of leaves or nodes, branching factor, if they get tired or loose motivation, therefore the number of number of nodes in a branch, and number of instances questions must be limited. To prevent those effects, Latin belonging to a leaf. 
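As an illustration of how such properties can be computed automatically, the sketch below extracts them from a fitted scikit-learn decision tree. It is only an orientation example with our own function name; the survey itself uses trees learned and visualized in Orange [5], where trees over multi-valued attributes can also have a branching factor greater than two.

```python
from sklearn.tree import DecisionTreeClassifier

def tree_complexity(tree: DecisionTreeClassifier):
    """Complexity properties from Section 3.3 for a fitted decision tree."""
    t = tree.tree_
    is_leaf = (t.children_left == -1)          # leaf nodes have no children
    n_leaves = int(is_leaf.sum())
    n_nodes = t.node_count
    branching = 2 if n_nodes > 1 else 0        # scikit-learn trees are binary
    depth = tree.get_depth()                   # test nodes along the longest branch
    instances_per_leaf = t.n_node_samples[is_leaf]
    return {"leaves": n_leaves,
            "nodes": n_nodes,
            "branching factor": branching,
            "max branch depth": depth,
            "instances per leaf (min, max)": (int(instances_per_leaf.min()),
                                              int(instances_per_leaf.max()))}

# Example: tree = DecisionTreeClassifier(max_leaf_nodes=8).fit(X, y)
#          print(tree_complexity(tree))
```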
Proposed tree complexity properties are square ordering is used, where each question occurs exactly systematically varied in the first five tasks of the survey. once at each place in the ordering and subsequently each Also, the proposed tree visualization properties are varied respondent gets a different ordering of the questions. in the compare task: using color to enhance readability (e.g. Finally, starting each task with a test question (from the pie-charts corresponding to class distributions in nodes), different domain) reduces the learning effect as well. layout of the tree based on the depth of subtrees, and general The survey design assumes the following order of layout and readability of the visualized tree (e.g. plain text tasks: starts with the simpler and progresses to more output vs. default Weka [10] and Orange [5] visualization). difficult ones. The compare and rate tasks, related to Additionally, the survey enables contrasting: meaningful subjective opinions, are placed toward the end – after the names of attributes, attribute values and classes to respondents acquire experience with the classifiers. meaningless ones; attributes with high information gain to Demographic data (DM knowledge, age, sex, language) the ones with low gain; and meaningful aggregated reflects the heterogeneity of the respondents group and attributes contrasted to conjunctions of isolated attribute- enables detailed analysis of classifier comprehensibility per value pairs (i.e. deep structure [20]). Finally, the survey different subgroups like students or experts. Hence, the test design also enables various statistical analysis for the each group consists of data mining experts on one hand and non- single leaf (branch of the tree) or for the entire tree. experts with basic knowledge about classification on the 3.4 Avoiding implicit survey bias other. Comparing the results of the two groups as well as considering the cultural background (e.g. different mother In order to prevent bias the following issues must be tongues), can provide new insights into classifier considered: choice of the classification domain, classifiers, comprehensibility. Finally, obtaining statistically significant and respondents group, and the ordering of questions. The results requires high enough number of respondents. classification domain has to be familiar to respondents - all of them are aware of relations among attribute values and 4 SURVEY IMPLEMENTATION classes and none of them have significant advantage of more in-depth knowledge about the domain. At the same time, the This work proposes online survey in order to facilitate domain must be broad and rich enough to enable learning a accurate measurements of time, automatic checking the range of classifiers with various properties listed in 3.3. correctness of answers, saving the answers in a database and Furthermore, choosing an interesting domain motivates the allowing remote participation. Several tools for designing respondents to participate in the survey. The Zoo domain and performing online surveys exist but do not meet all of from the UCI Machine Learning Repository [3] meets all the the design requirements (see section 3): Latin square design, requirements and is highly appropriate for general and measuring the time of answering each question, automatic heterogeneous population. 
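The Latin square ordering of questions mentioned in Section 3.4 can be generated, for example, with a simple cyclic construction: row r of the square is the question list rotated by r positions, so every question appears exactly once at every position, and respondents are assigned rows in a round-robin fashion. The sketch below shows one such construction; it is only an illustration, not necessarily the procedure of the implemented survey.

```python
def latin_square_orderings(question_ids):
    """Cyclic Latin square: row r is the question list rotated by r places."""
    n = len(question_ids)
    return [[question_ids[(r + i) % n] for i in range(n)] for r in range(n)]

def ordering_for_respondent(respondent_id, question_ids):
    """Assign respondents to rows of the square in round-robin fashion."""
    rows = latin_square_orderings(question_ids)
    return rows[respondent_id % len(rows)]

# e.g. ordering_for_respondent(3, ["Q1", "Q2", "Q3", "Q4", "Q5"])
#      -> ['Q4', 'Q5', 'Q1', 'Q2', 'Q3']
```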
It requires only elementary translation to several languages, using templates to quickly knowledge about animals expressed with 17 (mostly binary) define questions for a given task and automatically checking attributes: are they aquatic or airborne, do they breathe, how the correctness of answers. Therefore, custom online survey many legs they have, do they have teeth, fins or feathers, is implemented using MySQL database, PHP and JavaScript etc. The Zoo domain induces 7 classes: mammals, fish, programming languages, and CSS for webpage formatting. birds, amphibian, reptile, mollusk, and insect. The database includes one table for demographic data The selected classifiers must vary in complexity but not with auto-increment user id as the primary key and one table in other parameters that may influence comprehensibility per task with user id and question id as the primary key. and hence bias the results. In addition, classifiers are learned Each task table includes a field representing question order 72 number, a date-time field, and field(s) representing the [5] J. Demšar, T. Curk, A. Erjavec. Orange: Data Mining respondents’ answer. Tables for tasks 1-4 additionally Toolbox in Python. Journal of Machine Learning include fields with the measured answering time, list of all Research, 14 (Aug), pp. 2349−2353, 2013. respondent clicks and associated times, and the indicator of [6] P. Domingos. The role of occam's razor in knowledge correct answer. PHP is used to dynamically generate discovery, Data Mining and Knowledge Discovery, 3, survey webpages with correct ordering of questions for pp. 409–425, 1999. each respondent and storing the answers into the database. [7] T. Elomaa. In Defense of C4.5: Notes on learning one- Question webpages are generated by a separate PHP script level decision trees. Proceedings of 11th Int. Conf. on for each task based on a template and a simple data structure ML, pp. 62-69, 1994. defining the questions. An additional PHP script is used as a [8] A. A. Freitas. Comprehensible classification models - a library of shared functions and data structures: one position paper. ACM SIGKDD Explorations, 15 (1), pp. represents instances used in the survey and the other terms 1-10, 2013. (instructions, attribute names and value, classes, etc.) [9] C. Giraud-Carrier. Beyond predictive accuracy: what? translated into English, Slovenian and Croatian languages. Proceedings of the ECML-98 Workshop on Upgrading Additionally, PHP scripts are used for backing-up and Learning to Meta-Level: Model Selection and Data checking correctness of answers, login and help pages, and a Transformation, pp. 78-85, 1998. respondent home-page providing feedback on personal [10] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. progress and performance compared to the group. SVG Reutemann, I. H. Witten. The WEKA Data Mining images representing the classification trees exported from Software: An Update. SIGKDD Explorations, 11 (1), Orange [5] were automatically translated into the three 2009. languages using a Java program – the translation table is the [11] C. Houy, P. Fettke, P. Loos. Understanding same as in the PHP library script. Understandability of Conceptual Models – What Are JavaScript is used to measure the time of answering We Actually Talking about? Conceptual Modeling - each question. When a webpage is opened, only the Lecture Notes in Comp. Sc. vol. 7532, pp. 64-77, 2012. instructions and footer of the page are visible. Clicking on [12] J. Huysmans, K. Dejaeger, C. Mues, J. Vanthienen, B. 
the button “Start solving” calls a JavaScript function that Baesens. An empirical evaluation of the displays the question (e.g. table with attribute-value pairs comprehensibility of decision table, tree and rule based and image of a tree) and the answer form (drop-down lists, predictive models. Decision Support Systems, 51 (1), radio buttons) and starts the timer. Changing a value of the pp. 141-154, 2011. answer form field records the relative time and action type. [13] Y. Kodratoff, The comprehensibility manifesto, KDD When the respondent clicks the “Finish button” , the answer Nuggets (94:9), 1994. fields are disabled, time is calculated, and question difficulty [14] R. Kohavi. Scaling Up the Accuracy of Naive-Bayes rating options are displayed. When the “Next button” is Classifiers: a Decision-Tree Hybrid. Proceedings of the clicked, the collected values are assigned to hidden form 2nd Int. Conf. on KD and DM, pp. 202-207, 1996. fields in order to pass them to the PHP script that stores the [15] O. O. Maimon, L. Rokach, Decomposition data in the database and displays the next question. Methodology for Knowledge Discovery and Data A psychologist and two DM experts analyzed the initial Mining: Theory and Applications, World Scientific survey and improved version was implemented based on Publishing Company, 2005. their comments. It passed a validation test with 15 students [16] D. Martens, J. Vanthienen, W. Verbeke, B. Baesens. answering the first task at the same time. Preliminary Performance of classification models from a user analysis of the results for 10 respondents is in line with the perspective. Decision Support Systems, 51 (4), pp. 782- expectations, thus the survey is ready to be used in order to 793, 2011. collect data about tree comprehensibility. [17] R. Michalski, A theory and methodology of inductive learning, Artificial Intelligence 20, pp. 111–161, 1983. References: [18] M. Pazzani. Influence of prior knowledge on concept acquisition: experimental and computational results. [1] H. Allahyari, N. Lavesson, User-oriented Assessment of Journal of Experimental Psychology. Learning, Classification Model Understandability, 11th Memory, and Cognition 17, pp. 416–432, 1991. Scandinavian Conf. on AI, pp. 11-19, 2011. [19] Quinlan, J.R. Some elements of machine learning. Proc. [2] I. Askira-Gelman, Knowledge discovery: comprehe- 16th Int. Conf. on Machine Learning (ICML-99), pp. nsibility of the results. Proceedings of the 31st Annual 523-525, 1999. Hawaii Int. Conf. on System Sciences, 5, pp. 247, 1998. [20] E. Sommer. An approach to quantifying the quality of [3] K. Bache, M. Lichman. UCI Machine Learning induced theories. Proceedings of the IJCAI Workshop Repository, http://archive.ics.uci.edu/ml. University of on Machine Learning and Comprehensibility, 1995. California, School of Inf. and Comp. Science, 2014. [21] Z.-H. Zhou. Comprehensibility of data mining [4] M. W. Craven, J. W. Shavlik. Extracting algorithms. Encyclopedia of Data Warehousing and Comprehensible Concept Representations from Trained Mining, pp. 190-195, Hershey, 2005. Neural Networks. Working Notes on the IJCAI’95 WS on Comprehensibility in ML, pp. 61-75, 1995. 
73 PAMETNO VODENJE SISTEMOV V STAVBAH S STROJNIM UČENJEM IN VEČKRITERIJSKO OPTIMIZACIJO Rok Piltaver, Tea Tušar, Aleš Tavč ar, Nejc Ambrožič , Tomaž Šef, Matjaž Gams, Bogdan Filipič Institut “Jožef Stefan”, Odsek za inteligentne sisteme Jamova cesta 39, 1000 Ljubljana, Slovenija e-mail: {rok.piltaver, tea.tusar, ales.tavcar, tomaz.sef, matjaz.gams, bogdan.filipic}@ijs.si ABSTRACT potenciala, ki jih sistemi hišne avtomatizacije omogočajo. Zato v [4] predlagajo uporabo tehnik strojnega učenja za Prispevek opisuje programsko opremo za pametno in prepoznavanje navad uporabnikov in gradnjo napovednih celovito vodenje sistemov v stavbi, kot so ogrevanje, modelov njihovega obnašanja ter uporabo večkriterijske prezračevanje, senčenje, razsvetljava in upravljanje z optimizacije za zagotavljanje ustreznega upravljanju viri energije. Cilj je zagotoviti čim nižje stroške in hkrati inteligentnega doma, ki zadovoljuje nasprotujoče si kriterije. čim višje udobje za stanovalce. Sistem pametne stavbe Pričujoči prispevek v 2. razdelku opisuje delovanje pridobi podatke s senzorjev, nameščenih v stavbi, in se iz sistema OpUS, ki implementira predlagane rešitve za njih nauči navad in akcij uporabnikov v preteklem pametno vodenje sistemov v stavbah na podlagi učenja in obdobju. V drugem koraku uporabi večkriterijsko večkriterijkse optimizacije. Rezultati delovanja sistema optimizacijo, ki na podlagi simulacij išče najboljše OpUS so predstavljeni na primeru uporabe v 3. razdelku nastavitve parametrov za vodenje sistemov v stavbi. Prispevek se zaključi z razpravo v 4. razdelku. Uporabniku se najboljše nastavitve parametrov prikažejo v oblik urnikov. Za vsak urnik sta dana dva 2 SISTEM OPUS podatka, udobje in cena, na podlagi katerih uporabnik izbere najprimernejši urnik in s tem na preprost način Programska oprema sistema OpUS, prikazana na sliki 1, je nastavi parametre za avtomatizacijo sistemov v stavbi, ki razdeljena v štiri sklope: beli kvadrati predstavljajo zagotovijo želeni kompromis med udobjem in stroški. vhodno/izhodne module, modra kvadrata ustrezata moduloma za učenje, zelena modulu za optimizacijo in 1 UVOD oranžna moduloma za simulacijo. Številke predstavljajo zaporedje toka podatkov skozi sistem od vhodnih senzorskih Bivalni objekti v Evropi so leta 2004 porabili 37% vse podatkov (1) do parametrov za avtomatizacijo sistemov v porabljene energije [3], v Združenih državah Amerike pa je stavbi (10). Vsebina podatkovnih tokov in delovanje bil v letu 2010 ta delež kar 41% [2]. Iskanje strategij za posameznih modulov sta opisana v nadaljevanju. zmanjšanje porabe energije je torej ena izmed ključnih nalog sodobne družbe in tema številnih raziskav, ki se ukvarjajo z razvojem učinkovitih metod vodenja naprav, ki porabijo 2.1 Pridobivanje senzorskih podatkov veliko energije. Sistemi za ogrevanje, hlajenje in Obstoječi sistemi za hišno avtomatizacijo ponujajo široko prezračevanje prostorov npr. porabijo 50% vse energije, ki jo paleto senzorjev: od senzorjev gibanja, temperature, vlažnosti stanovanjske hiše potrebujejo za obratovanje [3]. Dobre in kakovosti zraka, osvetljenosti, pretoka vode in porabe strategije morajo ustrezno obravnavati nasprotujoče si električne energije do pametnih stikal in podatkov o zahteve uporabnikov, kot sta npr. sočasno doseganje delovanju posameznih naprav. Poleg tega omogočajo tudi energetske varčnosti in visoke stopnje ugodja. zbiranje, shranjevanje in posredovanje senzorskih podatkov Zmanjševanje stroškov shranjevanja in obdelave zunanjim sistemom (slika 1, točka 1). 
Sistem OpUS uporablja podatkov, dostopnost senzorjev in aktuatorjev ter enostavno modul za pridobivanje senzorskih podatkov, ki mora biti povezovanje različnih naprav v skupen sistem omogočajo prilagojen protokolu komunikacij in formatu podatkov, ki ga uporabo kompleksnih metod vodenja tudi v manjših bivalnih podpira sistem hišne avtomatizacije – to omogoča enotah. Obstoječi sistemi pametnih hiš sicer omogočajo prilagoditev sistema OpUS različnim sistemom hišne avtomatizacijo delovanja sistemov v stavbah po vnaprej avtomatizacije. Pridobljeni senzorski podatki se pretvorijo v nastavljenih urnikih, preklapljanje med načini delovanja poenoteno obliko, ki ob vsaki spremembi shrani čas, tip glede na zaznano prisotnost uporabnikov ali na zahtevo senzorja (določa mersko enoto, natančnost in frekvenco uporabnika preko spletnega vmesnika. Vendar večini meritev ipd.) in identifikacijo senzorja (določa lokacijo uporabnikov ne uspe nastaviti primernega urnika za senzorja in povezavo z zabeleženimi preteklimi vrednostmi) avtomatizacijo, saj morajo pri tem nastaviti veliko pogosto ter novo vrednost. Poenoteni podatki se shranijo v nerazumljivih parametrov in upoštevati nenehne spremembe podatkovno bazo za kasnejše analize in prikaz uporabniku ter svojih potreb in zunanjih vplivov, kot so vreme in cene se na zahtevo posredujejo moduloma za učenje (slika 1, točka energentov. Poleg tega take rešitve ne izrabijo celotnega 2). 74 2.2 Modula za učenje zanj ni dovolj udobno: npr. previsoka temperatura, slaba Pretekle raziskave so pokazale, da lahko z uporabo podatkov osvetljenost ali kakovost zraka. Zbrane podatke modul o prisotnosti uporabnikov in njihovih akcijah napovemo analizira v kontekstu časa, prostora in prepoznane aktivnosti. prisotnost ali odsotnost uporabnikov ter njihove navade z Če zazna, da uporabnik pri določeni aktivnosti v določenem relativno visoko točnostjo [1]. Na tej osnovi sta bila razvita prostoru večkrat izvede enako akcijo, iz tega sklepa, da je modula za učenje navad in akcij, opisana v nadaljevanju. nastavitev scene (opisana v [5]) za to aktivnost in prostor Modul za učenje navad periodično zahteva časovno okno neprimerna ter predlaga njeno spremembo. podatkov, iz katerih prepozna prisotnost in odsotnost Oba modula poleg specifičnih metod za prepoznavanje uporabnikov ter njihove aktivnosti: spanje, pripravo obroka akcij in aktivnosti uporabljata standardne algoritme strojnega in prehranjevanje, uporabo kopalnice ipd. Prisotnost učenja, da zgradita model, ki uporabniku zagotavljajo udobje. uporabnika v določenem prostoru prepozna neposredno iz Model vsebuje podatke o tem, kakšna je verjetnost, da podatkov o uporabi stikal v prostoru in zaznavah senzorjev uporabnik na določen dan v tednu ob določenem času gibanja, ostale aktivnosti pa s pomočjo zlivanja senzorskih potrebuje neko sceno (povezano z aktivnostjo uporabnika), in podatkov in uporabo konteksta: čas, prostor in predhodne kake vrednosti parametrov naj bodo nastavljene za aktivnosti. Npr. prižgana luč v kopalnici in 7 minut pretoka posamezno sceno (temperatura zraka, osvetljenost, zaprta tople vode sovpadata z aktivnostjo uporabe kopalnice; okna idr.) [5]. Model se posreduje modulu za optimizacijo, ugasnjene ali zatemnjene luči, odsotnost gibanja ter drugih kot prikazuje slika 1, korak 3. akcij uporabnikov ob podatku, da oseba ni zapustila stavbe 2.3 Modul za optimizacijo ter da je ura 4 zjutraj, sovpadajo z aktivnostjo spanje. Cilj optimizacije je poiskati nedominirane (t.j. 
najboljše) Prepoznavanje aktivnosti je pomembno, ker določa okoljske urnike po kriterijih udobja in cene – namesto te se lahko parametre, ki so ob določeni aktivnosti za uporabnika udobni: uporablja tudi količina porabljenih energentov ali količina v času aktivne prisotnosti mora biti temperatura v stavbi posledično izpuščenega CO primerna, zrak svež in ne presuh ali preveč vlažen; v času 2. Ker sta si kriterija udobje in cena nasprotujoča, je izhod postopka optimizacije množica spanja so lahko temperatura, osvetljenost in zaloga tople vode urnikov, ki so med sabo neprimerljivi (boljši v enem kriteriju nižje; v času odsotnosti temperatura in osvetljenost nista in slabši v drugem) in boljši od vseh ostalih urnikov. Urnik je pomembni, okna pa morajo biti zaprta. Samodejno učenje predstavljen kot zaporedje 15-minutnih časovnih intervalov spreminjajočih se uporabnikovih navad odpravi potrebo po za katere je treba določiti parametre vodenja posameznih ročnem (po)nastavljanju urnikov za avtomatizacijo sistemov sistemov v stavbi. Za iskanje nedominiranih urnikov se v stavbi ter hkrati omogoči boljše nastavitve, ki temeljijo na uporablja algoritem večkriterijske optimizacije, ki podatka o natančnih statističnih podatkih o pretekli uporabi. ceni in udobju urnika pridobi od modula za simulacijo (slika Modul za učenje akcij periodično zahteva podatke o 1, korak 6). akcijah uporabnika, ki jih le-ta izvede, kadar okolje v stavbi Slika 1: Programski moduli sistema OpUS in podatkovni tokovi med njimi 75 2.4 Simulacija 3 PRIMER UPORABE Modul za simulacijo na vhodu sprejme (slika 1, korak 5) V tem razdelku predstavljamo rezultate pametnega vodenja model stavbe, ki npr. določa toplotne izgube, porabo energije na primeru stavbe s fotovoltaičnimi paneli, kjer lahko posameznih sistemov ipd.; cene energentov, ki omogočijo določamo polnjenje in praznjenje baterije ter delovanje izračun stroškov določenega urnika; vremensko napoved, ki nekaterih porabnikov, medtem ko so scene omejene na določa pričakovane zunanje vplive na pogoje v stavbi; in nastavljanje želene temperature v stavbi. Stavbo modeliramo model uporabnikovih navad, ki je osnova za izračun udobja s simulatorjem, ki za dani urnik vrača njegove stroške in danega urnika. Simulator je osnovan na obstoječih splošnih udobje. V stroških upoštevamo tudi neporabljeno energijo v simulatorjih delovanja sistemov v stavbah, lastnostih stavbe bateriji, ki predstavlja prihodnji dobiček. Optimizacijo in konkretnih sistemov, prisotnih v stavbi ter metodi za izvajamo z evolucijskim večkriterijskim optimizacijskim izračun neudobja uporabnika glede na okoljske pogoje v algoritmom, ki poišče kompromisne urnike glede na stavbi in želene pogoje. Rezultati simulacije, ki so odvisni od obravnavana nasprotujoča si kriterija. točnosti simulatorja in vhodnih podatkov, se vrnejo modulu Optimizacijski algoritem kot vhodne podatke uporabi za optimizacijo (slika 1, korak 6), ki na njihovi podlagi podani začetni (neoptimirani) urnik stavbe in podatke o ceni predlaga nove urnike (slika 1, korak 4). Po končani energije, napovedi sončne energije, porabnikih in navadah optimizaciji se izbrani urnik in pripadajoče udobje ter cena uporabnikov. Dokler ni izpolnjen ustavitveni pogoj (čas, ki je posredujejo uporabniškem vmesniku (slika 1, korak 7). na voljo za optimizacijo), poteka preiskovanje prostora 2.5 Uporabniški vmesnik urnikov in njihovo vrednotenje preko omenjene simulacije. 
Vsak urnik opisuje delovanje stavbe med podanima Uporabniški vmesnik ponuja vizualizacijo, iz katere sta začetnim in končnim časom, pri čemer je vmesno obdobje razvidna cena in udobje najboljših urnikov. Uporabniku se ob razdeljeno na 15-minutne intervale. Urnik je sestavljen iz izbiri posameznega ponujenega urnika prikažejo razlike v naslednjih štirih komponent: trajanju in nastavitvah scen ter udobju in ceni med izbranim in trenutno nastavljenim urnikom. Množica najboljših • Temperatura: Za vsak interval določimo želeno (nedominiranih) rešitev omogoča, da uporabnik dobi vso temperaturo v stopinjah Celzija, ki mora zadoščati informacijo o delovanju sistema in se na podlagi te omejitvam (biti mora vsebovana v [Tmin, Tmax], kjer sta informacije odloči, kateri kriterijje zanj pomembnejši ter temperaturi Tmin in Tmax lahko podani za vsak interval kakšen kompromis med kriterijema mu bolj ustreza. posebej). Uporabnik lahko v koraku 8 (slika 1) izbere enega od • Energija+: V primeru, da imamo presežek energije predlaganih urnikov ter ga po potrebi prilagodi svojim željam (fotovoltaični paneli proizvedejo več energije, kot je – v tem primeru se ponovno izvede optimizacija izvajanja hiša porabi), za vsak interval določimo delovanje urnika in simulacija za oceno cene in udobja predlaganih baterije. Možni sta le dve vrednosti, in sicer 1 (baterija sprememb urnika. naj se polni) in 0 (baterija naj se ne polni). Če se baterija Sistem OpUS začne delovati z vnaprej nastavljenim ne polni, presežek energije prodajamo. urnikom, ki je dober približek splošno uporabnega urnika. • Energija– : V primeru, da imamo primanjkljaj energije Skozi čas se sistem nauči navad in potreb uporabnika ter (fotovoltaični paneli proizvedejo manj energije, kot je predlaga boljše urnike. Izboljšan urnik je primeren za hiša porabi), za vsak interval določimo delovanje uporabo, dokler ne pride do sprememb navad uporabnikov ali baterije. Možni sta le dve vrednosti, in sicer 1 (baterija do spremembe zunanjih vplivov: vremena kot posledice naj se prazni) in 0 (baterija naj se ne prazni). Če se letnih časov ali bistvene spremembe cen energentov na trgu. baterija ne prazni, potrebno energijo črpamo iz omrežja. Poleg izbire in primerjave urnikov uporabniški vmesnik • Porabniki: Za vsakega porabnika določimo čas, ko naj ponuja tudi pregled nad preteklo porabo in skladnostjo začne delovati, t.j. porabljati energijo. Končni čas in izbranega urnika s prepoznanimi potrebami uporabnika ter količina porabljene energije se izračunata iz lastnosti ročno upravljanje s sistemom za hišno avtomatizacijo. porabnika. 2.6 Vodenje sistemov v stavbi Optimizacijo smo preizkusili na naslednjem konkretnem Izbrani urnik in pripadajoči parametri za vodenje sistemov v primeru. Želimo optimirati vodenje stavbe s fotovoltaičnimi stavbi se iz uporabniškega vmesnika pošljejo modulu za paneli, eno baterijo in enim porabnikom, ki mora delovati pošiljanje navodil za avtomatizacijo (slika 1, korak 9). Le-ta enkrat dnevno. Zanima nas vodenje stavbe za dva naslednja je izhodni modul, ki mora biti prilagojen konkretnemu dneva: v prvem je napovedano jasno (sončno) vreme, v sistemu hišne avtomatizacije podobno kot modul za drugem pa pretežno oblačno vreme. Začetni urnik je določen pridobivanje senzorskih podatkov. Poleg določenega urnika na podlagi vremenske napovedi in uporabnikovih navad ter modul sprejme tudi podatke o zaznanih aktivnostih želenih temperatur. Optimizacijo izvajamo, dokler ne uporabnikov, ki jih prepozna modul za učenje navad, na pregledamo 1000 urnikov. 
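Spodnja skica zgolj ilustrira opisano predstavitev urnika (15-minutni intervali, želena temperatura, polnjenje in praznjenje baterije, zagoni porabnikov) ter preverjanje dominiranosti po kriterijih cene in neudobja. Imena razredov in polj so izmišljena in ne ustrezajo dejanski implementaciji sistema OpUS; udobje je tu zaradi enostavnosti izraženo kot neudobje, ki ga minimiziramo.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Urnik:
    """Urnik za obdobje, razdeljeno na 15-minutne intervale."""
    zelena_temperatura: List[float]   # ciljna temperatura [°C] za vsak interval
    polnjenje_baterije: List[int]     # Energija+: 1 = polni, 0 = ne polni (ob presežku)
    praznjenje_baterije: List[int]    # Energija-: 1 = prazni, 0 = ne prazni (ob primanjkljaju)
    zacetki_porabnikov: dict = field(default_factory=dict)  # porabnik -> indeks intervala zagona

@dataclass
class OcenaUrnika:
    """Rezultat simulacije urnika."""
    cena: float        # stroški (vključno z vrednostjo neporabljene energije v bateriji)
    neudobje: float    # odstopanje od želenih pogojev (nižje = večje udobje)

def dominira(a: OcenaUrnika, b: OcenaUrnika) -> bool:
    """Urnik a dominira urnik b, če ni slabši po nobenem kriteriju in je boljši vsaj po enem."""
    return (a.cena <= b.cena and a.neudobje <= b.neudobje and
            (a.cena < b.cena or a.neudobje < b.neudobje))
```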
podlagi katerih preklaplja med scenami, kadar se pričakovana Slika 2 predstavlja rezultate tega poskusa. Zeleni krožci aktivnost na urniku ne ujema z zaznano aktivnostjo (slika 1, prikazujejo vse generirane urnike. Začetni urnik je obarvan korak 10). rdeče, nedominirani urniki (vseh je deset) pa so predstavljeni 76 Slika 2: Vsi urniki dobljeni po postopku več kriterijske optimizacije z modrimi pikami. Kot lahko vidimo, optimizacijski algoritem najde različne kompromise med stroški in udobjem, ki so po obeh kriterijih boljši od začetnega urnika. Najugodnejši dobljeni urnik je podrobneje obrazložen v nadaljevanju. Slika 3 prikazuje dva grafa s podrobnejšo informacijo o najugodnejšem dobljenem urniku. Siva območja označujejo obdobja, ko uporabnik ni prisoten v stavbi. Takrat se meri samo poraba energije, ne pa tudi udobje. Zgornji graf na sliki 3 prikazuje ciljno temperaturo (tisto, ki ustreza največjemu Slika 3: Podrobnosti najugodnejšega urnika. Siva območ ja udobju), nastavljeno temperaturo ter dejansko temperaturo, ki označ ujejo obdobja, ko uporabnik ni prisoten v stavbi. jo izmeri simulator, ko stavba poskuša voditi sisteme gretja in ohlajanja tako, da se čim bolj približa nastavljeni udobnejši in cenejši od ročno nastavljenih urnikov, saj se temperaturi. Vidimo, da je razkorak med nastavljeno in lahko sproti prilagajajo zunanjim vplivom in potrebam dejansko temperaturo precejšen predvsem v obdobjih uporabnikov. Uporaba takega sistema odpravi tudi potrebo po neprisotnosti, ko se stavba ne hladi oz. ogreva. Spodnji graf ročnem nastavljanju urnikov in hkrati spodbuja energetsko kaže, kaj se v določenem intervalu dogaja z energijo. Baterija varčnost uporabnikov. se včasih polni (pozitivna energija) in včasih prazni (negativna energija), energijo prodajamo (pozitivna energija) LITERATURA in kupujemo (negativna energija), vidimo tudi, v katerih [1] Gjoreski M., Gjoreski H., Piltaver R., Gams M. intervalih obratuje porabnik. Na grafu je označena tudi tarifa "Predicting the arrival and the departure time of an kupovanja energije, ki je lahko bodisi visoka bodisi nizka. V employee." Zbornik 16. mednarodne multikonference prvem, sončnem dnevu, fotovoltaični paneli proizvedejo Informacijska družba – IS 2013, str. 43–46. veliko energije, ki se deloma shrani v baterijo, deloma pa [2] Annual Energy Outlook 2012. U.S. Energy Information proda. V drugem dnevu je takšne energije zelo malo. Večina Administration (EIA), 2012. energije, ki se je shranila v baterijo, ostaja v bateriji tudi po [3] Perez-Lombard L., Ortiz J. in Pout C. "A review on koncu urnika (za porabo v prihodnjem obdobju). buildings energy consumption information." Energy and Buildings, 40 (3): 394–398, 2008. 4 RAZPRAVA [4] Šef T., Piltaver R., Tušar T. "Projekt OpUS: optimizacija Prispevek opisuje arhitekturo programske opreme, ki s upravljanja energetsko učinkovitih pametnih stavb." pametnim vodenje sistemov v stavbi rešuje pereč problem Zbornik 16. mednarodne multikonference Informacijska zagotavljanja visoke stopnje udobja in hkrati nizkih stroškov. družba – IS 2013, str. 110–113. Arhitektura temelji na ideji uporabe strojnega učenja [5] Tavčar A., Piltaver R., Zupančič D., Šef T., Gams M. uporabnikovih navad in potreb ter večkriterijske optimizacije "Modeliranje navad uporabnikov pri vodenju pametnih parametrov vodenja na podlagi simulacije. Poleg arhitekture, hiš." Zbornik 16. mednarodne multikonference ki omogoča vključitev v obstoječe sisteme pametnih stavb, so Informacijska družba – IS 2013, str. 114–117. 
predlagane tudi formalne predstavitev problemov učenja, optimizacije in simulacije ter algoritmi, ki so primerni za reševanje teh problemov. Primer uporabe predlaganih rešitev kaže, da je tak sistem sposoben predlagati urnike, ki so 77 DETERMINATION OF CLASSIFICATION PARAMETERS OF BARLEY SEEDS MIXED WITH WHEAT SEEDS BY USING ANN Kadir Sabancı1, Cevat Aydın2 1 Department of Electrical and Electronics Engineering, Batman University, Batman, Turkey 2 Department of Agricultural Machinery, Selçuk University, Konya, Turkey Tel: +904882173500; fax: +904882173601 e-mail: kadir.sabanci@batman.edu.tr ABSTRACT One of the basic problems that cause loss of yield in wheat is weed seeds that mixed with wheat seeds. In this performance characteristics to biological neural networks study, discrimination of barley seed which mixed with [8]. Simply, ANN that imitates the function of the human wheat seeds has been realized. Classification of wheat brain has several important features such as learning from and barley seeds has been achieved by using artificial data, generalizing, working with an unlimited number of neural network and image processing techniques. In the variables etc. study, image processing techniques and the use of It is seen that ANN is used in crop production, which artificial neural network have been made possible with constitutes an important field of agriculture engineering, in Matlab software. By using Otsu method, histogram data identification and classification stages of a wide range of of seed images that were taken from web camera was agricultural products such as grape, wheat, peppers and obtained. By using histogram data, with multi-layered olives [9]. artificial neural network model, the system was In this study, a software has been developed for educated and classification was made. Besides, wheat distinguishing the wheat and barley seeds which has been and barley seeds in the picture info where mixed seeds mixed during harvesting. Wheat and barley seeds which has taken from the web camera exist were counted. been mixed, have been attempted to distinguish by using image processing techniques and artificial neural networks. Multilayer artificial neural networks has been performed for 1 INTRODUCTION the process to be more precise and faster. System has been trained by using barley and wheat seeds pictures. Wheat and Quality is one of the important factors in agricultural barley seeds have been classified successfully by using products marketing. Grading machines have great role in improved system. This study exemplifies image processing quality control systems. The most efficient method used in and artificial neural networks in agriculture. grading machines today is image processing. Digitisation of the image is the process in which the image in the camera is converted to electical signals with optical – 2 MATERIAL METHOD electrical mechanism [1]. Image processing, as a general term, is manipulation and In this study, image of wheat and barley seeds photos have analysis of the pictorial information [2]. been taken by using a webcam with 1.3 MP (Mega Pixels) Image processing techniques are used in different areas such and having CCD sensor. Usage of image processing and as industry, security, geology, medicine, agriculture. Image artificial neural networks are provided by Matlab. In this processing and artificial neural networks are used in study, 50 wheat seeds, 50 barley seeds were used. 
Black agriculture in fruit color analysis and classification, root background is used at the stage of image processing for growth monitoring, measurement of leaf area, determination faster and correct results. of weeds [3,4,5,6,7]. Firstly, wheat and barley seeds image information was Artificial neural networks is an information processing received to obtain image informations that was to enter to system which have been exposed with inspiration of Artificial neural networks. Picture information of wheat and biological neural networks and includes some similar barley seeds are shown in Figure 1. 78 Figure 3: Binary image information belong to wheat and Figure 1: Wheat and barley seeds barley seeds Wheat and barley seeds image information was converted to gray level images. Filtration was performed to pictures In this study, Matlab Software’s Artificial Neural Network tor reduce noise and interference. Wheat and barley seeds toolbox were used to distinguish wheat and barley seeds. pictures which were converted into gray levels are shown in ANN 's main tasks are to learn structure in the model data Figure 2. set, to make generalizations in order to fulfill to required task. To make this, the network is trained with the samples of related event to make generalization. Multi-layered artificial neural networks are the most commonly used in ANN models. Figure 2: Gray level images belong to wheat and barley seeds Image information which is at gray level were converted to black and white picture by using Otsu method. Otsu algorithm provides the clustering of these pixels according to the distribution of pixel values in the image. Thresholding process is one of the important processes in image processing. Especially, this method is used for highlighting closed and discrete areas of the object in the image. It Figure 4: Multi-layered artificial neural network includes the arrangement of image which was divided into pixels until to the image in dual structure. Simply, In the study, neural network model with multilayer, thresholding process is a process of discarding pixel values feedforward, back propagation was used. Multilayer on the image according to specific values, and replacing Perceptron (MLP) networks are a feedforward neural other value / values. Thus determination of object lines and network model which has different number of neurons in the backgrounds of the object on the image were provided [10]. input layer, an intermediate layer consisting of one or more Threshold value is determined by using Otsu method. if it is layers(s) and consisting of output layer. The structure of under this value, pixels are converted to 0 value; if it is over MLP neural network is shown in Figure 4. MLP neural this value pixels are converted to 1 value. Wheat and barley network outputs of the neurons in a layer are connected to all seeds pictures in black and white pictures are shown in input of the neurons with weights. The number of neurons in Figure 3. the input and output layer is determined according to the implementation problems. The number of intermediate layers, number of neurons in the intermediate layer and activation function are determined by the designer by trial and error method [11]. 79 Segmentation process was performed by using digital image References processing techniques on images belonging to mixed wheat and barley seeds and by determining the place of each seeds [1] Yaman, K., 2000. Görüntü işleme yönteminin Ankara on the picture. 
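The processing chain described above (grayscale conversion, noise filtering, Otsu thresholding, flattening the binary image into a feature vector, MLP classification) was implemented by the authors in Matlab. Purely as an illustration, an equivalent sketch using OpenCV and scikit-learn could look as follows; the variable names, the placeholder lists of image paths and labels, and the chosen hidden-layer size are assumptions, not the authors' code.

```python
import cv2
import numpy as np
from sklearn.neural_network import MLPClassifier

def seed_to_feature_vector(image_path, size=(100, 100)):
    """Grayscale -> smoothing filter -> Otsu threshold -> flattened binary vector."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    gray = cv2.medianBlur(gray, 3)                        # simple noise removal
    _, binary = cv2.threshold(gray, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    binary = cv2.resize(binary, size, interpolation=cv2.INTER_NEAREST)
    return binary.flatten().astype(np.float32)

# seed_image_paths / seed_labels: prepared beforehand (placeholders);
# labels: 0 = wheat, 1 = barley
X = np.stack([seed_to_feature_vector(p) for p in seed_image_paths])
y = np.array(seed_labels)

clf = MLPClassifier(hidden_layer_sizes=(100,), activation="logistic",
                    max_iter=250).fit(X, y)
```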
Segmentation was performed with digital image processing techniques on the images of the mixed wheat and barley seeds, determining the location of each seed in the picture. Each seed was cropped to a size of 100x100 pixels. First, the digital image of each seed was converted to a gray-level image. The picture was filtered in order to remove noise and very small objects (dust, etc.). The noise-free gray-level pictures were converted to black-and-white pictures by using the Otsu method. The data sets that enter the ANN were created by converting the 100x100 black-and-white picture of each seed to a column matrix.

3 CONCLUSION
The average classification success of the MLP model on the test set reached 100% in the structure with 100 neurons in the hidden layer. When creating the MLP structure, the logarithmic sigmoid was used as the activation function of the neurons in the hidden and output layers. Error back-propagation was used for training the ANN model and the network was trained for 250 steps. The results obtained in the classification using the MCA process are presented in Table 1.

Table 1: Classification results using the MCA process

Number of neurons in hidden layer | Correctly classified wheat seeds (of 50) | Correctly classified barley seeds (of 50) | Average success (%)
25 | 44 | 48 | 92
50 | 45 | 49 | 94
75 | 47 | 49 | 96
100 | 50 | 50 | 100

In this study, gray-level image information of wheat and barley seeds was obtained by using image processing techniques. Afterwards, the images were converted to binary pictures with the Otsu method and the system was trained with a multilayer neural network model. In the realized system, the distinguishing of mixed wheat and barley seeds was then performed. The system can be further developed with a moving belt and camera system so that the distinguishing of wheat and barley seeds is carried out in real time. The packaging of a given number of seeds could also be performed.

References
[1] Yaman, K., 2000. Görüntü işleme yönteminin Ankara hızlı raylı ulaşım sistemi güzergahında sefer aralıklarının optimizasyonuna yönelik olarak incelenmesi. Yayınlanmamış Yüksek Lisans Tezi, Gazi Üniversitesi, Fen Bilimleri Enstitüsü.
[2] Castelman, R. K., 1996. Digital image processing. Prentice Hall, Englewood Cliffs, New Jersey, USA.
Neuman, M. R., H. D. Sapirstein, E. Shwedyk and W. Bushuk. 1989. Wheat grain colour analysis by digital image processing. II. Wheat class discrimination. Journal of Cereal Science 10: 183-188.
[3] Keefe, P. D. 1992. A dedicated wheat grain image analyzer. Plant Varieties and Seeds 5: 27-33.
[4] Trooien, T. P. and D. F. Heermann, 1992. Measurement and simulation of potato leaf area using image processing. Model development. Transactions of the ASAE 35(5): 1709-1712.
[5] Pérez, A. J., F. Lopez, J. V. Benlloch and S. Christensen. 2000. Colour and shape analysis techniques for weed detection in cereal fields. Computers and Electronics in Agriculture 25: 197-212.
[6] Dalen, G. V. 2004. Determination of the size distribution and percentage of broken kernels of rice using flatbed scanning and image analysis. Food Research International 37: 51-58.
[7] Jayas, D. S. and C. Karunakaran. 2005. Machine vision system in postharvest technology. Stewart Postharvest Review, 22.
[8] Fausett, L., 1994. Fundamentals of Neural Networks: Architectures, Algorithms and Applications. Prentice Hall.
[9] Kavas, G., Kavas, N., 2012. Gıdalarda yapay sinir ağları ve bulanık mantık. DÜNYA yayıncılık, GIDA Dergisi 2012-01: 93-96.
[10] Yaman, K., Sarucan, A., Atak, M., Aktürk, N., 2001. Dinamik çizelgeleme için görüntü işleme ve ARIMA modelleri yardımıyla veri hazırlama. Gazi Üniv. Müh. Mim. Fak. Dergisi, 16(1): 19-40.
[11] Öztemel E., 2003. Yapay Sinir Ağları. İstanbul: Papatya Yayıncılık.
This study is an example of using image processing and neural network in agricultural field. 80 NOVI GOVOREC: NARAVNO ZVENEČ KORPUSNI SINTETIZATOR SLOVENSKEGA GOVORA Tomaž Šef Odsek za inteligentne sisteme, Institut “Jožef Stefan”, Jamova cesta 39, 1000 Ljubljana e-mail: tomaz.sef@ijs.si POVZETEK organizacijami, ker s tem podjetje pridobi dostop do V najnovejšega znanja in tehnologij. Povečuje se delež članku je predstavljen prototip novega naravno zvene vlaganj v raziskovalno razvojno dejavnost v celotnem čega korpusnega sintetizatorja slovenskega govora. Temelji na govorni zbirki, ki jo razvijata prometu podjetja, poveča pa se tudi obseg sredstev za Institut »Jožef Stefan« in podjetje Amebis. Trenutna raziskovalno razvojno dejavnost. demo verzija prototipa sintetizatorja uporablja Pričujoče delo je nastalo v okviru projekta »Analiza in četrtino te zbirke. Podprta sta po en moški in en ženski glas. ovrednotenje naprednih tehnologij govorjenega jezika v Prototip je bil razvit v okviru projekta »Analiza in pametnih stavbah« (Raziskovalni vavčer 2012, Amebis ovrednotenje naprednih tehnologij govorjenega jezika v d.o.o., Kamnik). Namen projekta oz. raziskave je pridobitev pametnih stavbah« (Raziskovalni vavčer 2012, Amebis novega znanja in spretnosti za nadgradnjo obstoječih d.o.o., Kamnik). sistemov govornih in jezikovnih tehnologij z namenom uporabe v sodobnih inteligentnih vmesnikih pametnih stavb. 1 UVOD Posebna pozornost je namenjena dinamičnemu podajanju govornih informacij. Projekt vsebuje dve aktivnosti. V Za angleški jezik in druge večje jezike so različni govorno okviru prve aktivnosti so se kritično analizirale in podprti sistemi že nekaj časa dosegljivi in imajo zelo širok ovrednotile napredne tehnologije govorjenega jezika v krog uporabnikov. V zadnjem času se čedalje pogosteje pametnih stavbah. Sinteza govora, razpoznavanje govora, uporabljajo tudi v različnih mobilnih aplikacijah, ki pa v razpoznavanje govorcev ter njihovega psihofizičnega stanja našem domačem slovenskem jeziku žal niso dostopne oz. ne s pomočjo računalniške analize govorjenega zvočnega delujejo. signala, odpirajo povsem nove dimenzije razvoja Najbolj naravno zveneči sintetizatorji govora temeljijo na inteligentnih uporabniških vmesnikov. Govorni vmesniki so korpusni sintezi. Metoda temelji na preiskovanju vnaprej nadvse primerna tudi kot pomoč invalidom (npr. slepim in posnete in označene govorne zbirke. Išče se zaporedja tistih slabovidnim), starejši populaciji in nekaterim drugim posnetih glasov pri katerih se želene lastnosti čim bolj družbenim skupinam. Druga aktivnost se osredotoča ujemajo. Kvaliteta takšnih sintetizatorjev govora je predvsem na dinamično podajanje informacij ali opozoril v predvsem odvisna od zasnove govorne zbirke na kateri govorni obliki. Takšni sistemi so jezikovno odvisni, zato temeljijo. V splošnem velja, da je sintetizator govora tujih rešitev ni mogoče kupiti oz. ustrezno prilagoditi našim kvalitetnejši, če uporabljamo za sintezo daljše osnovne potrebam. V Sloveniji se pojavlja čedalje večja potreba oz. segmente s čim manj spremembami prozodičnih parametrov, povpraševanje po kvalitetnem, naravno zvenečem, saj te povzročajo dodatna popačenja sintetiziranega govora razumljivem, čim širše sprejemljivem in splošno dostopnem [1]. Strošek razvoja korpusnih sintetizatorjev je izredno govornem bralniku slovenskih besedil, zato je bila v okviru visok, zato je večinoma na razpolago le omejeno število druge aktivnosti predlagana čim bolj optimalna zasnova in glasov. izvedba takšnega sistema. 
Rezultat te aktivnosti je tudi Komercialne raziskave s področja govornih in jezikovnih prototip novega Govoraca, ki bo v nadaljevanju podrobneje tehnologij so pri nas pogojene z majhnostjo slovenskega predstavljen. trga. Iz stroškovnega vidika je povsem vseeno ali razvijamo npr. sintetizator govora za jezik, ki ga govori milijarda ljudi, 2 GOVORNA ZBIRKA ZA KORPUSNO SINTEZO ali pa zgolj dva milijona. Slovenski trg je izredno majhen GOVORA zato brez spodbud in subvencij s strani države razvoj tako kompleksnih tehnoloških izdelkov in storitev ni mogoč. Ob Najpomembnejša dejavnika pri snovanju govorne zbirke za ustrezni subvenciji se za podjetje zmanjša tveganje potrebe korpusne sinteze govora sta izbira njene vsebine in (pre)velikih vlaganj v raziskave in razvoj, zato je podjetje označevanje posnetkov. Izbira velikosti govorne zbirke je pripravljeno vložiti tudi del lastnih sredstev. Potrebno je posledica kompromisa med želenim številom variacij glasov učinkovito sodelovanje z raziskovalno razvojnimi oz. njihovim pokritjem na eni strani ter časom in stroški vezanimi na razvoj na drugi strani. Upoštevati je potrebno 81 tudi čas za kasnejše preiskovanje govorne zbirke in potreben • Doprinos povedi je enak vsoti vseh ocen zaželenosti prostor za njeno hranjenje [2]. Kakovostna korpusna sinteza nizov (iz spiska), ki se v povedi pojavijo. zahteva, da ima govorna zbirka pravilno označeno tako • Doprinos posamezne povedi normiramo z dolžino identiteto posameznih govornih segmentov kot njihov povedi (št. besed v povedi ali št. fonemov v povedi). natančen položaj znotraj zbirke. Običajno avtomatskim • Določimo takšno utež, da bodo dolžine izbranih metodam in postopkom sledi »ročno« popravljanje oznak, ki stavkov čim bolj ustrezale statistični porazdelitvi ga je ne glede na hiter razvoj tehnologije še vedno zelo dolžin stavkov iz korpusa. veliko. • Izberemo poved z najvišjim normiranim doprinosom. 2.1 Zasnova govorne zbirke • Iz spiska odstranimo vse glasovne nize, ki jih izbrana Razvoj govorne zbirke za korpusno sintezo govora poved vsebuje. obsega naslednje korake: • Ponovno ocenimo vsako poved in izberemo • ustvari se obsežno tekstovno zbirko besedil, ki najboljšo (glede na novi spisek v katerem so izločeni pokriva različne zvrsti (dnevni časopis, revije, tisti glasovni nizi, ki smo jih že pokrili) ter leposlovje ipd.), popravimo spisek. • iz zbirke besedil se odstrani vse oznake vezane na • Postopek ponavljamo dokler ne izberemo želenega oblikovno podobo (glava besedila, tabele ipd.), števila povedi. • okrajšave, števila ipd. se pretvori v polno besedno 4. Ovrednotenje rezultatov: obliko (normalizacija besedil), • Vsakih 1000 povedi izdelamo statistiko difonov, • besedila se pretvori v predvideni fonetični prepis trifonov, štirifonov in drugih polifonov, ki jih že (grafemsko-fonemska pretvorba), pokrivamo (gre za glasovne nize, ki smo jih do takrat • optimizira se obseg zbirke glede na vnaprej že izločili iz zgoraj omenjenega spiska). pripravljene kriterije (metoda požrešnega iskanja); 5. Dodatne izboljšave algoritma: doseči želimo statistično ustrezno vzorčenje • Ker mora zbirka vsebovati vse možne kombinacije izbranega področja govorjenega jezika, difonov, algoritem popravimo tako, da difone • izbrane stavke se posname (ali pa se izlušči del dodatno utežimo glede na ostale polifone. Na takšen obstoječih zvočnih zapisov), način bo algoritem na začetku dajal prednost • posneto govorno gradivo se fonetično in prozodično povedim, ki bodo pokrile čim več novih difonov. 
označi (samodejno grobo označevanje, fino ročno Predvidoma se vsi difoni pokrijejo že po ca. 100 popravljanje). stavkih. Postopek za čim optimalnejšo izbiro povedi: • Pri trifonih in štirifonih upoštevamo pri robnih 1. Statistič na obdelava besedil: glasovih tudi podatek o glasovni skupini, ki ji • Statistično obdelamo celoten besedni korpus in pripadajo (npr. štirifon "krak" ne bo doprinesel prav določimo pogostost pojavljanja posameznih glasov in dosti novega v našo zbirko, če ta že vsebuje štirifon glasovnih nizov v besedilu. Pri tem razlikujemo še "krat"; zato oceno koristnosti takega štirifona med naglašenimi in nenaglašenimi glasovi ter glasovi, popravimo navzdol). To lahko naredimo preprosto ki se pojavljajo na koncu stavka (oz. na mestih tako, da v spisek vnesemo dodatne nize skupaj z zajema zraka - ločila). Presledke na drugih mestih njihovimi frekvencami pojavljanja v korpusu (primer lahko ignoriramo oz. odstranimo. takega štirifona: "k"+"r"+"a"+"pripornik"). • Vključimo vse stavke (povedne, velelne, vprašalne • Algoritem z različnim uteževanjem izboljšamo tako, itd.) in izdelamo statistiko posameznih vrst povedi oz. da končni nabor vsebuje različne povedi (povedne, stavkov. vprašalne, velelne, enostavne, sestavljene, 2. Izdelava spiska glasovnih nizov z oceno zaželenosti naštevanje, itd.). Tako lahko isti korpus učinkovito posameznega niza: uporabimo tudi za generiranje prozodičnih • V spisek vključimo nabor vseh teoretično možnih parametrov pri sintezi govora. kombinacij difonov; tudi tiste na katere pri statistični obdelavi nismo naleteli (zaradi robustnosti 2.2 Snemanje govorne zbirke sintetizatorja govora). Snemanje govorne zbirke je potekalo v studiu RTV • V spisek vključimo vse trifone, štirifone in (po Slovenija ob prisotnosti izkušenega tonskega tehnika. Med potrebi) ostale zaželene (najpogostejše) polifone, na 10 profesionalnimi govorci smo izbrali najustreznejši moški katere smo naleteli pri statistični obdelavi besedil. in ženski glas. Med branjem besedila so govorci imeli • Utež oz. ocena zaželenosti niza je odvisna od nameščene elektrode Laryngographa, s katerimi smo pogostosti njegovega pojavljanja v besedilu. spremljali nihanje glasilk za lažje kasnejše označevanje 3. Postopek izbire povedi: period govornega signala. Samo snemanje je zaradi • Ocenimo doprinos glasovnih nizov za vsako poved iz obsežnosti besedila, ki ga je bilo potrebno prebrati trajalo tekstovnega korpusa. več mesecev. Pri tem so nastavitve opreme ves čas ostale 82 nespremenjene. Pred vsakim snemanjem je govorec poslušal Spremenjenih oz. na novo napisanih je le nekaj modulov: svoje predhodne posnetke, s čimer se je skušalo zagotoviti • modul za nastavljanje prozodičnih parametrov je čim bolj enak način govora, z enako intonacijo ipd. izpuščen; optimizacija nastavljanja teh parametrov je sestavni del algoritma za izbiro najustreznejših 2.3 Statistični podatki o govorni zbirki govornih segmentov V tabeli 1 so podani osnovni statistični podatki o govorni • možnost spreminjanja govornih parametrov je zbirki za korpusni sintetizator slovenskega govora Amebis namenoma okrnjena; algoritem skrbi le še za glajenje Govorec. prehodov na mestih lepljenja. 
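Postopek požrešne izbire povedi iz razdelka 2.1 lahko ponazorimo s kratko skico. Gre zgolj za ilustrativen primer v Pythonu ob poenostavljenih predpostavkah: poved je predstavljena kot množica glasovnih nizov (difonov, trifonov ipd.), uteži so hipotetične, doprinos pa je normiran kar s številom nizov v povedi. Dejanska implementacija, dodatno uteževanje difonov in upoštevanje vrst povedi v sistemu Govorec se od te skice lahko razlikujejo.

```python
from collections import Counter

def greedy_select(sentences, weights, n_select):
    """
    sentences: slovar poved -> množica glasovnih nizov (difoni, trifoni ...), ki jih poved vsebuje
    weights:   slovar glasovni niz -> ocena zaželenosti (npr. pogostost v korpusu)
    Vrne seznam izbranih povedi; po vsaki izbiri se pokriti nizi odstranijo s spiska.
    """
    remaining = dict(weights)
    selected = []
    for _ in range(n_select):
        best, best_score = None, 0.0
        for sent, units in sentences.items():
            if sent in selected:
                continue
            gain = sum(remaining.get(u, 0.0) for u in units)
            score = gain / max(len(units), 1)      # normiranje doprinosa z dolžino povedi
            if score > best_score:
                best, best_score = sent, score
        if best is None:
            break
        selected.append(best)
        for u in sentences[best]:
            remaining.pop(u, None)                 # že pokriti nizi ne prinašajo več doprinosa
    return selected

# hipotetičen primer uporabe
sentences = {"poved A": {"d1", "d2", "t1"}, "poved B": {"d2", "d3"}, "poved C": {"d1", "d3", "t2"}}
weights = Counter({"d1": 5, "d2": 3, "d3": 4, "t1": 2, "t2": 1})
print(greedy_select(sentences, weights, 2))
```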
Tabela 1: Statistični podatki o govorni zbirki Amebis Govorec

Velikost besednega korpusa: 7.145.345 povedi (77 milijonov besed)
Obseg govorne zbirke: 4.000 povedi (46.785 besed)
Število različnih difonov: 1.883
Število različnih trifonov (št. kombinacij v korpusu): 21.369 (24.702)

3 NOVI GOVOREC

Novi Govorec za sintezo neomejenega slovenskega govora v osnovi ohranja nespremenjeno arhitekturo (slika 1) [3]:
• analiza besedila (predobdelava besedila, grafemsko-fonemska pretvorba),
• nastavljanje prozodičnih parametrov (trajanje, osnovna frekvenca, amplituda, premori) in
• generiranje govornega signala (izbira osnovne enote, lepljenje, sprememba govornih parametrov).

Slika 1: Zgradba sistema Amebis Govorec za sintezo slovenskega govora

Potek korpusne sinteze [4, 5]:
• na razpolago imamo večje število primerkov posamezne enote,
• za vsak segment (difon), ki ga potrebujemo pri sintezi, v govorni bazi poiščemo takšnega, ki bo »najbolje« sintetiziral ciljni segment,
• najboljše zaporedje segmentov je tisto, ki minimizira ciljno ceno (angl. »target cost«) in ceno združevanja (angl. »joint cost«) segmenta; problem je rešljiv z Viterbijevim algoritmom,

$$C(t_1^n, u_1^n) = \sum_{i=1}^{n} C^{target}(t_i, u_i) + \sum_{i=2}^{n} C^{join}(u_{i-1}, u_i)$$

pri čemer $u_i$ predstavlja parametre i-tega izbranega segmenta, $u_{i-1}$ parametre njemu predhodnega segmenta, $t_i$ pa ciljne parametre i-tega segmenta,
• prva vsota ponazarja ceno zaradi razlike med ciljno in dejansko vrednostjo parametrov izbranih segmentov, druga vsota pa ceno zaradi neujemanja parametrov na mestu spajanja dveh segmentov,
• parametri, ki jih upoštevamo pri računanju ciljne cene, so: tip fonema, fonetični kontekst, naglas, pozicija znotraj besede in povedi, tip povedi, f0, trajanje ipd.; posamezni parametri so različno uteženi ($w_k$),

$$C^{t}(t_i, u_i) = \sum_{k=1}^{p} w_k^{t}\, C_k^{t}(t_i, u_i)$$

• parametri, ki jih upoštevamo pri računanju cene združevanja, pa so: ujemanje f0, ujemanje energije, ujemanje formantov in drugih spektralnih karakteristik (MFCC koeficienti); tudi tukaj so posamezni parametri različno uteženi ($w_k$),

$$C^{j}(u_{i-1}, u_i) = \sum_{k=1}^{p} w_k^{j}\, C_k^{j}(u_{i-1}, u_i)$$

• uteži pri računanju cene združevanja nastavljamo ročno s poslušanjem,
• uteži pri računanju ciljne cene lahko izračunamo avtomatično [4] s povezavo akustičnih razdalj ter višjenivojskih fonetičnih in prozodičnih parametrov; uporabimo linearno regresijo.

Algoritmi, ki združujejo daljše segmente, se izkažejo za boljše, zato k temu »teži« večina sodobnih algoritmov. Optimizirajo se predvsem fonetični in prozodični parametri, cena združevanja zaradi akustičnih parametrov je bolj v ozadju oz. se sploh ne upošteva.

4 SKLEP

Izdelali smo prototip kvalitetnega, naravno zvenečega, razumljivega in široko sprejemljivega korpusnega sintetizatorja slovenskega govora. Zaenkrat je implementiranih le nekaj osnovnih algoritmov; naprednejši algoritmi so še v razvoju in se testirajo. Sintetizator trenutno uporablja le četrtino govornega korpusa (ca. 1000 stavkov na glas).

[3] T. Šef, Analiza besedila v postopku sinteze slovenskega govora, doktorska disertacija, Fakulteta za računalništvo in informatiko, Univerza v Ljubljani, 2001.
[4] A. Hunt, A. Black: Unit selection in a concatenative speech synthesis system using a large speech database, Proceedings of ICASSP 96, vol. 1, pp. 373-376, 1996.
[5] P. Taylor: Text-to-Speech Synthesis, Cambridge University Press, 2009.

Uporabljena govorna zbirka pokriva skoraj vse možne [6] T. Šef, M.
Romih: Zasnova govorne zbirke za kombinacije difonov in trifonov na katere smo naleteli pri sintetizator slovenskega govora Amebis Govorec, analizi besednega korpusa s preko 7 milijoni povedi. Zbornik 14. mednarodne multikonference Snemanje govorne zbirke (moški in ženski glas) je potekalo Informacijska družba, zvezek A, str. 88-91, 2011. več mesecev. Za vsak glas je bilo prebranih preko 4.000 [7] M. Rojc, Z. Kačič: Design of optimal Slovenian speech povedi povprečne dolžine 11 besed. Za lažje označevanje corpus for use in the concatenative speech synthesis zbirke smo poleg govornega signala posneli še signal system, Proceedings of the Second international Laryngographa, ki prikazuje nihanje glasilk. Sledil je ročni conference on language resources an evaluation, pregled posnetega gradiva in grobo samodejno označevanje; Athens, Greece, str. 321-325, 2000. temu sledi še fino popravljanje napak. Gre za najobsežnejšo [8] J. Žganec Gros, A. Mihelič, N. Pavešič, M. Žganec, S. izdelano govorno zbirko namenjeno sintezi slovenskega Gruden: AlpSynth – Concatenation-based Speech govora do sedaj [6,7,8]. Synthesis for the Slovenian Language, 47th Novi Govorec je že na začetku razvoja presegel naša International Symposium ELMAR-2005, Zadar, pričakovanja. V veliko delih je umetno generirani govor tako Hrvaška, str. 213-216, 2005. dober, da ga marsikateri poslušalec težko oz. sploh ne loči od običajnih posnetkov (še posebej, če so ti predvajani preko mobilnih komunikacijskih naprav ipd.). Naravnost in razumljivost govora sta povsem primerljiva s sintetizatorji govora za druge večje jezike. Poslušanje takšnega govora ni naporno, zato je sintetizator primeren za najširši krog potencialnih uporabnikov. Z nadaljnjim razvojem lahko upravičeno pričakujemo še dodatno občutno izboljšanje sintetiziranega govora. Do konca leta bo v novega Govorca vključen celotni govorni korpus, osnovni algoritmi pa bodo nadgrajeni z naprednejšimi in kompleksnejšimi. Govorni korpus bo dodatno pregledan in »očiščen« vseh zaznanih napak. Pri izbiri govornih enot bo uporabljena večkriterijska optimizacija glede na akustične, fonetične in prozodične kriterije. Uporabnik si bo sam izbral ali mu je ljubše, da sintetizator govora govori s čim bolj naravno prozodijo, ali pa mu je pomembnejša razumljivost in zveznost akustičnih parametrov, izbiranje čim daljših govornih enot ter njihovo lepljenje na fonetično najprimernejših mestih. Z drsniki ali izbiro na Pareto fronti bo na preprost način podal svoje preference. Literatura [1] A. Mihelič, J. Gros, N. Pavešič, M. Žganec: Pridobivanje govorne zbirke za korpusni sintetizator govora Phonectic, Zbornik konference Jezikovne tehnologije, str. 45-49, 2000. [2] I. Amdal, T. Svendsen: Unit selection Synthesis Database Development Using Utterance Verification, Zbornik INTERSPEECH 2005, str. 2553-2556, 2005. 84 CLOUD-BASED RECOMMENDATION SYSTEM FOR E- COMMERCE Gašper Slapnič ar1,2, Boštjan Kaluža2 1Faculty of Computer and Information Science, Večna pot 113, 1000 Ljubljana, Slovenia 2Department of Intelligent Systems, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia E-mail: slapnicar.gasper@gmail.com, bostjan.kaluza@ijs.si ABSTRACT 2 CLOUD-BASED MACHINE LEARNING PLATFORMS This paper leverages cloud-based machine learning platform to implement an item-based recommendation Recently, a wide variety of cloud-based recommendation system for an e-commerce application. The solution is systems emerged. 
All of them offer various machine- based on Prediction.IO platform, which offers a full- learning algorithms, while the implementations of the stack architecture based on MongoDB database, Hadoop prediction model generation and supported algorithms differ. framework for distributed processing, Apache Mahout In our study, we focused on the currently available systems, scalable machine learning library, and RESTful API. We which offered distributed, scalable, full-stack architecture, implemented an item-based recommendation engine for simple API access and were documented to a degree that product suggestions in an online retail store using real- allowed us a basic understanding of the whole platform. world data. Preliminary results are quite promising Available options for development and implementation of achieving Mean Average Precision of 6 %. custom algorithms were also a priority and significant advantage of a platform. 1 INTRODUCTION BigML [3] is a commercial solution exclusively based on decision trees and therefore very suitable for classifications. A challenge that many retailers are facing today in the It can be tested freely with smaller amounts of data and has saturated online market is how to gain a competitive a streamlined process for creating decision tree models. The advantage and obtain cost-effective recommendation interface is highly intuitive and the model creation is a matter features without large investment into machine learning of a few clicks. The main strength of it is the simplicity of research and development. This challenge is difficult as we usage and useful visualization of the generated decision tree are getting large amounts of data even in smaller web model. It is however limited to decision trees, which are not applications. Therefore, it is expected that the solution is suitable for recommendation problem. simple to implement, fast, distributed and scalable [1]. QMiner [4] is an open source platform that offers high levels As a solution to this challenge, cloud-based recommendation of customization, a decent amount of available algorithms, systems offer recommendation-as-a service. These solutions and is well documented. It is implemented in C++ and offers are an emerging trend lately, with an open source server a JavaScript API. The architecture is distributed and solution just recently raising $2.5 million in seed funding [2]. customizable. However, the platform is still in development The main idea is to provide a recommendation engine in a phase and faces some difficulties with deployment in cloud as a solution, whereas a retailer provides data in production environment. utilizes the output form the cloud. PredictionIO [5] is an open source solution implemented in The aim of this paper is twofold. The first goal is to review, Scala and offers APIs in most of the popular languages. It compare and evaluate several cloud-based recommendation shares similarities with QMiner’s distributed architecture systems. The second goal is to implement a reference item- and consists of four layers. At the bottom, there is a recommendation system that can be used in a real-world MongoDB database, followed by Hadoop framework for application. As a result, the paper aims at a fully working scalable distributed computing. 
The third layer is the heart of solution that is able to address the major challenges – the prediction model generation process – Apache Mahout, a scalability together with simple API access and scalable machine-learning library with many popular implementation. algorithms already implemented. The top layer contains an Preliminary results show the level of accuracy we can obtain API, which offers simple access to the prediction server. by using the default set of algorithms and parameters on real- Due to simple setup process, its open source nature and many world data. These results can be later used as a baseline available machine-learning algorithms, we chose orientation point in further comparison with custom PredictionIO for further steps. solutions. Other solutions such as SensePlatform [6], Google Prediction API [7] and Microsoft Azure ML [8] are also 85 available commercial solutions that mainly focus on cloud- based implementation of core machine-learning algorithms. 3 DATA We obtained real-world data containing orders in an online retail store in a period of one year. The data was described by item (product) data, user (consumer) data and user-item interactions (item bought) data, for example, user U1 made an order O1 in which items A,B,C,D were bought. In total, there were around 10.000 items, 36.000 users and 300.000 user-item interactions. There were some minor occurrences of missing data as well as some duplicated entries, which varied for a single character, yet represented the same product. Those entries were simply filtered with basic manual corrections. The only preprocessing included replacing the encoding of the original dataset with utf-8 encoding. 4 ITEM-BASED RECOMMENDATION Our goal was to develop a recommendation system by leveraging the PredictionIO built-in engines. Figure 1 shows the PredictionIO architecture comprising several engines. Figure 1: PredictionIO server architecture [14]. Each engine processes data and constructs a predictive model independently. There can be several engines within a Collaborative filtering [11, 12] is among the most used and single application, where each of them will serve its own successful methods for this type of recommendation prediction results based on corresponding predictive model. systems. Collaborative filtering finds the users with similar Each engine can be configured with a variety of options and preferences (user based) in such a way that it finds items, parameters for fine tuning, such as preference for newer which were similarly rated by other users (item based). items, preference for surprising discovery, custom attributes User based approach has some issues, especially with and most notably the goal to be maximized through scalability, since computation grows with both the number predictions. Based on the selected goal, which can be any of users and the number of items; hence, item-based action ( like, view, conversion) or a rating threshold (e.g. approach is more common due to its simplicity and better rating >= 3), it is possible to evaluate the available scalability. algorithms using built-in interface. The basic idea of the item based approach is to take the items some user has rated/bought and computes how similar they 4.1 Recommendation engines are to other items. Based on this similarity it then selects k PredictionIO server offers two recommendation engines: most similar items. item recommendation engine and item similarity engine. 
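To make the item-based idea described above concrete (similar items are found from the items users bought together, and the k most similar ones are recommended), the following minimal sketch scores candidate products by a simple co-occurrence count. It is only an illustration: the engines themselves rely on Mahout's implementations and, as discussed in Section 4.2, on a log-likelihood-ratio similarity rather than raw co-occurrence, and the example baskets are made up.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(baskets):
    """Count how often two items were bought together (a crude stand-in for item similarity)."""
    co = defaultdict(lambda: defaultdict(int))
    for items in baskets:
        for a, b in combinations(set(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend(cart, co, k=5):
    """Score every item not in the cart by its summed similarity to the items already in it."""
    scores = defaultdict(float)
    for owned in cart:
        for other, sim in co.get(owned, {}).items():
            if other not in cart:
                scores[other] += sim
    return sorted(scores, key=scores.get, reverse=True)[:k]

# hypothetical order history: each basket is one order
baskets = [["A", "B", "C"], ["A", "C", "D"], ["B", "C", "D"], ["A", "D"]]
co = build_cooccurrence(baskets)
print(recommend({"A", "C"}, co, k=2))
```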
Each engine builds a prediction model based on the underlying algorithm. In both cases, the prediction model is generated using Mahout's Collaborative Filtering [9] methods. Mahout's kNN (k-nearest neighbours) Item Based Collaborative Filtering was used in the item recommendation engine, and Mahout's Item Similarity Collaborative Filtering was used in the item similarity engine. Both are implemented so that they can run either on a single machine or on multiple machines in a distributed/scalable setting.

Mahout's item-based collaborative filtering implementation is based on the pseudo-code shown in Figure 2. The algorithm computes the similarity between pairs of items, where one item of the pair is an item already preferred by a user and the other item of the pair is not.

Figure 2: Pseudo-code for the recommendation algorithm.

To demonstrate the general algorithm from Figure 2 on an e-commerce example: for each product that is not yet in a user's shopping cart, it takes each product already in the shopping cart and computes the similarity between these pairs of products. It then weighs the computed preference by the frequency of common occurrences.

4.2 Mean average precision as evaluation criteria

For result evaluation, PredictionIO offers built-in evaluations where we can regulate the size of the learning and testing set as well as some other parameters, such as the number of predictions and the number of iterations. These simulated evaluations use Mean Average Precision (MAP) to measure accuracy [10].

The critical step is the evaluation of similarity. PredictionIO offers several similarity options, such as Pearson's correlation, cosine similarity, the Jaccard coefficient, the log-likelihood ratio etc., which makes it simple to compare the results each of them gives. In our case, the log-likelihood ratio was chosen as it fits our problem of product recommendation best. This similarity measure is based on finding and counting the cases where two items appear together [15] and is similar to an expanded co-occurrence matrix.

If we want to measure the precision of the prediction for the n "best recommendations" for some user, we use MAP@n:

$$\mathrm{MAP}@n = \frac{1}{N}\sum_{i=1}^{N} \mathrm{ap}@n_i,$$

where $N$ is the number of all users and $\mathrm{ap}@n_i$ is the average precision for user $i$, defined as

$$\mathrm{ap}@n = \sum_{k=1}^{n} P(k)\,\Delta R(k),$$

where $P(k)$ is the precision after recommending $k$ items, i.e., the number of correctly recommended items among the first $k$ recommendations divided by $k$, and $\Delta R(k)$ is the change of recall in step $k$, which equals $1/n$ if the $k$-th item was recommended correctly and 0 otherwise [10]. For example, if exactly one of the first three recommended items is correct and it appears at position 2, then $\mathrm{ap}@3 = P(2)\,\Delta R(2) = (1/2)\cdot(1/3) \approx 0.17$.

Figure 3: Comparison of MAP prediction accuracy for random recommendation, the item recommendation engine and the item similarity engine (the vertical axis of the chart ranges from 0 % to 7 % MAP).

6 REFERENCE IMPLEMENTATION

Based on the comparison presented in the previous section, we reused the best model and implemented PredictionIO within an Ajax-based Django application, which implements a demo e-commerce application. This allowed us to evaluate the response time and the application's behaviour and reliability in a test user environment.

Figure 4: Entry site of our demo e-commerce application [13].

5 EXPERIMENTAL RESULTS

We ran the evaluation in three iterations, each prediction offering 20 items, with 70 % of the total data used as the training set and 30 % as the testing set. The evaluation was first run for a random recommendation algorithm in order to get a baseline with which we can compare our results. We then
ran it for item recommendation engine and item similarity engine, using the above-mentioned Collaborative filtering algorithms with Log-likelihood as a similarity measure. The application offers available products to a user dynamically upon input as seen on Figure 4 ( 1. Iskalno The best result was 6% MAP accuracy using item similarity okno), which allows us to prevent invalid entries and engine as seen on Figure 3. This is notably better than the improve the user experience. Upon selection a set of five 0.1% MAP accuracy using baseline random products is returned with minimal response time as seen on recommendation. For our el-commerce example this means Figure 5 ( 3. Priporoč eni izdelki). that given 100 recommended products, 6 of these would actually be chosen by a customer and added in their shopping cart. 87 with product recommendation service based on real world data. The application is capable of taking a set of products chosen by the customer and returning a set of n products, which are predicted to be the most likely to be bought by this customer. The prediction model utilizes Prediction.IO platform using Mahout’s Item Based Collaborative Filtering, using Log-likelihood ratio as a similarity measure. The preliminary experiments showed that we can expect up to 6% MAP accuracy of predictions. This is an important result, which can be used as a baseline value in future work. It allows us to compare a custom-developed recommendation algorithms with a basic out-of-the-box solution. The experiment established grand basis for further research, that is, full-stack architecture, modeling, evaluation, and testing. Further work will include implementation of a Figure 5: Application offering 5 products to a user who selected a custom engine for prediction and model generation, which random product [13]. could be easily used in the prebuilt architecture. As displayed on Figure 6, client connects to a web server References running on a remote machine. If no products are chosen, the server returns a simple html site. When a client selects a [1] Very Large Data Bases Endowment Inc., A. Labrinidis, product from the list, the server detects it and collects the H.V. Jagadish, Challenges and Opportunities with Big Data data from the shopping cart in form of strings. This data is . sent to PredictionIO. PredictionIO processes the data, applies http://vldb.org/pvldb/vol5/p2032_alexandroslabrinidis_ a prediction model and then returns a JSON response vldb2012.pdf, 2014-09-05 containing recommendations to the web server. The [2] Techcrunch, S. O’Hear , PredictionIO Raises $2.5M For Its Open Source Machine Learning Server. webserver parses the response and displays it to the client. It is important to note that minimal coding is required for http://techcrunch.com/2014/07/17/predictionio/, 2014- interaction with PredictionIO. 09-05 [3] BigML, Inc. https://bigml.com/, 2014-09-05 [4] QMiner. http://qminer.ijs.si/, 2014-09-05 [5] PredictionIO. http://prediction.io/, 2014-09-05 [6] Sense. https://senseplatform.com/, 2014-09-05 [7] Google Developers, Google Prediction API. https://developers.google.com/prediction/?hl=sl, 2014- 09-05 [8] Microsoft Azure Machine Learning. http://azure.microsoft.com/en-us/services/machine- learning/, 2014-09-05 [9] The Apache Software Foundation, Apache Mahout. https://mahout.apache.org/users/recommender/ recommender-documentation.html, 2014-09-05 [10] Kaggle Inc. https://www.kaggle.com/wiki/ MeanAveragePrecision Figure 6: UML diagram of client - web server - PredictionIO [11] T. 
Segaran, Programming Collective Intelligence, interaction. 2007 [12] B. Sawar, G. Karypis, J. Konstan, J. Riedl, Item-Based A single machine implementation offered good response Collaborative Filtering Recommendation Algorithms. time for a small set of users trying to access the application http://www.ra.ethz.ch/cdstore/www10/papers/pdf/p519 simultaneously. In order to achieve the scalability and serve .pdf, 2014-09-05 multiple requests per second, we would have to deploy [13] Demo e-commerce application. http://prediction- multiple processing machines. io.ijs.si:8001/ [14] Basics of PredictionIO. 7 CONCLUSION http://docs.prediction.io/current/concepts/basics.html We overviewed and compared a set of distributed, scalable [15] T. Dunning, Surprise and Coincidence. cloud-based platforms for machine learning. The goal we http://tdunning.blogspot.com/2008/03/surprise-and- achieved is a fully functional demo e-commerce application coincidence.html, 2014-09-05 88 NOVEL IMAGE PROCESSING METHOD IN ENTOMOLOGY Martin Tashkoski, Ana Madevska Bogdanova Ss. Cyril and Methodious University, Faculty of Computer Sciences and Engineering, Rugjer Boshkovikj 16, 1000 Skopje, Macedonia Tel: +389 2 3070377; fax: +389 2 3088222 e-mail: tashkoskim@yahoo.com, ana.madevska.bogdanova@finki.ukim.mk ABSTRACT recognition that will stop the transfer of contaged plants and fruits at the border. Image processing and machine learning together offer Following our previous work [CIIT 2014], in this paper powerful methods for image classification. In this paper we propose a new method - Self – removing Noise Method we present novel way of processing microscopic images (SRNM) for automatic image background cleansing. We for automatic classification of two similar insects shortly describe the method for solving the problem for belonging in the family Aleyrodidae, superfamily distinction of the whiteflies using image processing and Aleyrodoidea (whiteflies). They are very similar and can machine learning techniques, and give some results gained be distinguished only in a certain stage of their from different tests. The rest of the paper is organized as development (fourth larval stage or “pupae stage”). follows. In Section 2 we present some of the work related to Following our previous work, we propose a novel image our problem. In Section 3 we present the entomology processing method for automatically removing images’ problem and the basic idea for solving this problem. The background noise. We also present results of a results of the different test are presented in Section 4, and classification process using the processed images with the conclusion and future work are presented in the final the proposed method. Section 5. 1 INTRODUCTION 2 RELATED WORK Microscopic image processing dates back half a century In this section we briefly present some work related to the when it was realized that some of the techniques of image problem of these two whiteflies and some techniques and capture and manipulation, first developed for television, existing software for image processing. could also be applied to images captured through the The authors in [3] discovered the Bemisia tabaci and microscope. Trialeurodes vaporariorum in Macedonia. They explain the Since agriculture has been important to people danger of these pests and guess that these whiteflies were thousands of years, we focus on microscopic images of transferred from other neighboring countries. pests that harms agricultural crops. 
Authors in [6] explain all of the stages and the physical In this paper we present one of the pests’ problems – look, that can be useful for understanding the differences of distinguinshing between two whiteflies lat. Trialeurodes the whiteflies and choosing the best indicator for their vaporariorum (less dangerous) and lat. Bemisia tabaci distinction. (more dangerous). They are small insects (about 1mm long) The first step for solving this problem is undertaken in with wings and bodies all covered with white, powdery wax [5]. The authors proposed the algorithm Symmetrical Self – [1]. They are fed with the plants juice and as a consecuence, Filtration that can extract the important (vasiform orifice) they may reduce vigour and growth of the plant [2]. part from the microscopic images in pupae stage. This is the Bemisia tabaci is more dangerous than Trialeurodes insect part we use for the classification process in our work. vaporariorum because their larvae can inject some enzymes Authors in [6] present an important insight in explaining into the plant and those enzymes cause chlorosis or uneven various methods for processing microscopic images. Their ripening (depending on the plant), and induce physiological work on this specific area of images is very useful, because disorders [2]. often there are dust and particles that appears as noise in the In Macedonia, these whiteflies can be found in the image and the extraction of some information is very southern parts and their occurrence has been observed 5 – 6 difficult task. years ago [3]. It is assumed that they are transferred from The authors in [7] and [8] are using Wolfram neighboring countries by importing various plants that Mathematica to solve some biological problems with image causes problems with their spreading. processing. The author in [7] presents a solution for The border custom control has developed interest in measuring the locations of particles of microscopic images. preparing an intelligent system for dangerous pest It is about needle – free injection devices that fire powdered 89 drug particles into the skin on a supersonic gas shock wave. Thin slices of a target are photographed by microscope, and on that images are applied different filters and the noise of the images is removed to find the locations of the particles. The author in [8] analyze a microscopic image of red blood cells and count them, using morphological operations and measurement tools. 3 THE METHODOLOGY In this section we present the entomology problem at hand and the methodology used for creating a descriptor, the image processing process, removing the background of the images, and the classification process. Although the whiteflies look similar, they have differences in each stage of their development. But the problem is that these whiteflies can physically change their Figure 2: Microscopic images of larva of Trialeurodes look depending on the plant they feed themselves with, vaporariorum their environment and its temperature. According to [2], the accurate indicator for distinction is based almost entirely on 3.2 Image Preprocessing their fourth larval stage or “puparial stage”. First of all, the captured images were cropped by applying the algorithm for Symmetrical Self – Filtration [5], and as an output we obtained the images with the characteristic part (vasiform orifice). These images were used as input to our method for automatically removing the background. 
3.3 Image Processing - Self – removing Noise Method (SRN Method) Figure 1: Bemisia tabaci and Trialeurodes vaporariorum, closer look to “vasiform orifice” The new method proposed in this paper, the Self – removing Noise Method, removes the noise and the On Figure 1 we can see the two larvae of both whiteflies background of the processed microscope images obtained with their characteristic parts. The parts of the larvae are: as output in 3.2. in order to prepare the images for the operculum (1), vasiform orifice (2), lingula (3), caudal classification process. The vasiform orifice part is not furrow (4), caudal seta (5). The characteristic part that is the clearly presented in all of the images. Some of them are best indicator for distinction of these two whiteflies is really noisy. For solving this problem we developed a vasiform orifice (2). Vasiform orifice of Bemisia tabaci is method that proceeds all of the images, applies some filters, thinner than the one of Trialeurodes vaporariorum. and after that as an output returns the images with the vasiform orifice part in a white background. 3.1 Obtaining the training and test set We will present the basic steps (1-6) of the method In order to develop an Entomology classification System, applied on all of the images - Figure 3. the substantial part of the solution was to find appropriate The SRN Method is as follows: images of these two whiteflies. The images of Bemisia tabaci were found on the Internet database. Since there is no Step 1: importing the images with the vasiform orifice available data base for the second whitefly, we had to make part; an image collection of ourown. The whitefly Trialeurodes Step 2: converting each of the images to “grayscale” vaporariorum was found in greenhouse in the southern parts image; of Macedonia, on the plants cucumber and tomato. There Step 3: producing 3 categories for each image and are several steps of taking these images: preparation of the classifying the pixels according to their color intensity biological samples, awareness of the mechanical damage of (black, grey and white); the samples, type and quality of the microscope. The most Step 4: applying bilateral filter to the images; difficult problem was that each larva was covered with dust Step 5: morphological image processing; which appeared as a noise at the image. At Figure 2 there Step 6: converting each of the images to binary image; are microscopic images with different zoom levels of the Step 7: deleting the isolated small groups of black larva. The top image is larva captured before removing the pixels. dust. 90 Figure 5 present some of the accepted images that were obtained as outputs of the SRN Method. Figure 5: Accepted images (the first row are images of Bemisia tabaci, the second row are images of Trialeurodes vaporariorum) Figure 3: The basic steps in the Self – removing noise Method for removing the background 4 TESTS AND RESULTS In Step 1, we import the images with the vasiform orifice The following tests we undertaken using the code written in part. The second step converts the images to “grayscale” Visual Studio 2010 in C# presented in [4]. The processed style. images obtained with the proposed SRN Method are used as input for the code, and the output is a file with In Step 3, for each image we form 3 dimensional vector parameters formatted properly for Weka and SVM light. 
As of pairs, in order to find the parameters that characterize the we presented in paper [4], the best descriptor contains the image the best way according their color intensity. Because following parameters: five different widths of the image the images were converted to “grayscale” style in Step 2, the (number of pixels from the image in one row, taken pixels values variy in scale of 0 to 1 (0 for black and 1 for consecutively from top to bottom through the image on white). First pair represents the number of pixels that has equal distances), height, and ratio (height/bottom_width). values 0 – 0.33 , the second pair – number of pixels with The SRN Method was used on 346 images total, 49 of values 0.33 – 0.67, and the third pair – number of pixels Bemisia tabaci and 297 of Trialeurodes vaporariorum. The with values 0.67 – 1 (Figure 3). This information is used in images of Trialeurodes vaporariorum were gathered from the following steps to adjust the appropriate filters. different sources and contained different background The fourth step is about applying bilateral filter on each colors. The script as output obtained 326 images total (39 of of the images, as it is shown at Figure 3. The bilateral filter Bemisia tabaci and 287 of Trialeurodes vaporariorum) [9] is a non – linear, edge – preserving and noise – reducing ready for classification and 20 rejected images. filter for images. The intensity value at each pixel in an image is replaced by a weighted average of intensity values 4.1 Classification in Weka and SVM light from the nearby pixels. This weight can be based on a Gaussian distribution. We made 3 groups with 10 tests and used different In Step 5 we made some morphological image classifiers in Weka and SVM light [11]. For the first two processing. We used the method of closing as the basic rule groups (presented in [4]) we had 109 images total with for removing noise. Closing removes small holes and it manually removed background (48 images of Bemisia tabaci tends to enlarge the boundaries of bright regions in the and 61 of Trialeurodes vaporariorum). In the first group we image and shrink background color holes in such regions. maintain even ratio in the test folder (we use 10 instances of The sixth step is converting each of the images to binary both classes for testing), and in the second group we images, and after that in the last step we use function for maintain even ratio in the training folder (we use 40 deleting the small groups of black pixels. instances of both classes for training). For the third group we Figure 4 presents some rejected images. The method had 326 images total with automatically removed counts the pixels at the borders (up, down, right and left). If background (39 images of Bemisia tabaci and 287 images of there are more black pixels than a half of the border length, Trialeurodes vaporariorum). In this group we maintain the these images are rejected for the classification process as even ratio in the test folder (we use 10 instances of each undistinguishable. class for testing). 4.2 Results In this section we present results of the classification process using the images obtained with the proposed SRN Method i.e. with automatically removed background. We maintain the even ratio in the test folder. In the paper [4] we have published the classification Figure 4: Rejected images results of manually cleaned images where we maintained the even ratio in the test folder. 
According the average best 91 results in Weka, for Bemisia tabaci and Trialeurodes using manually cleaned images in the classification process. vaporariorum were obtained with the classifier lazy.IBk We will extend our set of images of both pests and the final (90% correctly recognized instances of Bemisia tabaci, and step will be connecting all of these methods into one 95% correctly recognized instances of Trialeurodes integrated system for fast recognition of the pests with an vaporariorum). According to the average best results in user - friendly interface. We also believe that the database of SVM light for correct recognition of Bemisia tabaci were the whitefly Trialeurodes vaporariorum we created, will be obtained with the RBF kernel (for gamma=0.01) 89%, and usefull to the other entomology researchers. best results for correct recognition of Trialeurodes vaporariorum were obtained with all of the kernels except References the RBF kernel (for gamma=0.01) 97%. [1] G. S. Hodges, G. A. Evans. An identification guide to For the new group of tests with the SRN method, we the whiteflies (Hemiptera: Aleyrodidae) of the have produced 10 tests, and for each test we have taken 10 southeastern United States. Florida Dept. Agriculture, instances of Bemisia tabaci and 10 instances of Trialeurodes Division of Plant Industry, Gainesville, FL 32614. vaporariorum for testing of the total set with 326 instances. 2005. For every test we performed classifications in Weka (for [2] C. Malumphy. Protocol for the diagnosis of quarantine different classifiers) and in SVM light (for different kernels). organism Bemisia tabaci (Gennadius). Central Science According the average best results (Figure 6) in Weka Laboratory, Sand Hutton, York YO41 1LZ, UK. for Bemisia tabaci, were obtained with the classifiers [3] С. Банџо, Р. Русевски. Bemisia tabaci Genad. Bayes.BayesNet (78% correctly recognized), and best Присуство и распространување во Република results for correctly recognizing Trialeurodes vaporariorum Македонија, XXXI - во традиционално советување were obtained with the classifier trees.J48 - 89%. за заштита на растенијата на Република Македонија, According to the average best results (Figure 6) in SVM Охрид, 2006. In Macedonian. light for correctly recognizing Bemisia tabaci were obtained [4] M. Tashkoski, A. M. Bogdanova. Image classification with the RBF kernel (for gamma=0.001) 70%, and best in Entomology. In: Proceedings of 11th International results for correct recognition of Trialeurodes vaporariorum Conference on Informatics and Information were obtained with all of the kernels except the RBF kernel Technologies, CIIT 2014. (for gamma=0.001) 85%. [5] М. Киндалов. Препознавање на слики во ентомологијата и практична имплементација со Bemisia tabaci и Trialeurodes vaporariorum. Дипломска работа, ментор: проф. Д-р Ана Мадевска Богданова, Скопје (2009). In Macedonian. [6] Q. Wu, F. A. Merchant, K. R. Castleman. Microscope image processing. ISBN: 978-0-12-372578-3. Elsevier Inc. 2008. [7] J. McLoone. Buiding a microscope application in Mathematica, 2011. http://blog.wolfram.com/2011/09/09/building-a- microscopy-application-in-mathematica/ [8] S. Ashnai. How to count cells, annihilate sailboats, and warp the Mona Lisa, 2012. Figure 6: Average of the results in Weka and SVM light with http://blog.wolfram.com/2012/01/04/how-to-count- even ratio of the both classes in the testing set (for cells-annihilate-sailboats-and-warp-the-mona-lisa/ automatically cleaned images) [9] C. Tomasi, R. Manduchi. 
Bilateral Filtering for Gray and Color Images. Print ISBN: 81-7319-221-9. Proceedings of the 1998 IEEE International Conference 6 CONCLUSION on Computer Vision, Bombay, India, 1998. [10] M. H. Malais, W. J. Ravensberg. Knowing and In this paper we proposed a new method for automatically recognizing - The biology of glasshouse pests and their removing background of the images of Bemisia tabaci and natural enemies. Publisher: Koppept Biological Trialeurodes vaporariorum - Self – Removing Noise Method Systems. ISBN 90 5439 126 X. 2003. in order to develop an automatic system for classification. [11] T. Joachims. Optimizing Search Engines Using We tested with several classifiers, using the images that we Clickthrough Data, Proceedings of the ACM obtained with this method and we presented the results. Conference on Knowledge Discovery and Data Mining According the new results we can conclude that the Self – (KDD), ACM, 2002. removing noise method proved as a good solution for automatically cleaning the images. The SRN Method that we used for filtering the images will be improving in the future in order to obtain results approximate to the results when 92 ARHITEKTURA SISTEMA OpUS Aleš Tavč ar1,2, Jure Šorn1, Tea Tušar1, Tomaž Šef1, Matjaž Gams1,2 Odsek za inteligentne sisteme, Institut »Jožef Stefan« 1 Jamova cesta 39, 1000 Ljubljana, Slovenija Mednarodna podiplomska šola Jožefa Stefana2 Jamova cesta 39, 1000 Ljubljana, Slovenija e-mail: {ales.tavcar, tea.tusar, tomaz.sef, matjaz.gams}@ijs.si POVZETEK Sistem OpUS, ki je bil razvit v okviru projekta e-storitve Obstoje za gospodarstvo, je samostojna in robustna rešitev, ki se či sistemi hišne avtomatizacije oz. pametnih stavb ne omogo lahko vgradi v široko paleto obstoječih sistemov hišne čajo naprednih funkcionalnosti, ki bi jih od takih sistemov pri avtomatizacije. Združuje več inovativnih komponent, kjer čakovali. Trenutne rešitve omogo vsaka skrbi za določen vidik inteligentnega upravljanja čajo zgolj spremljanje stanja sistemov in okolja v hiši ter krmiljenje hišnih naprav preko mobilnih naprav pametne stavbe. in spleta. Definiranje urnikov delovanja je prepuš V nadaljevanju prispevka najprej opravimo kratek čeno samim uporabnikom, kar se običajno odraža v večji pregled sorodnega dela. Nadaljujemo s predstavitvijo porabi in neučinkovitem delovanju. V pričujočem arhitekture sistema OpUS in na kratko opišemo naloge prispevku so predstavljeni načini za nadgradnjo posameznih komponent sistema. sistemov hišne avtomatizacije z inteligentnimi metodami učenja navad uporabnikov in metodami optimizacije 2 SORODNI SISTEMI delovanja. Tak sistem je zmožen spremljanja obnašanja Obširen pregled sodobnih sistemov vodenja v pametnih uporabnikov, se učiti njihovih navad in prilagajati zgradbah je opisal Dounis et.al [1]. Za razliko od klasičnih delovanje glede na spreminjajoče se življenjske navade načinov vodenja, je za optimalno, prediktivno ali adaptivno in potrebe. vodenje potrebno imeti model zgradbe. Temu pristopu sledimo v tem prispevku. 1 UVOD Vodenje sistemov z uporabo podatkov in znanj o Trenutne komercialne rešitve hišne avtomatizacije ne uporabnikih in okolju predstavlja nove smernice raziskav in ponujajo naprednih funkcionalnosti, ki bi jih pričakovali v razvoja tako imenovanega vseprisotnega in prodornega takih sistemih. Namesto pametnega predvidevanja navad in računalništva (ang. 
ubiquitous computing, pervasive potreb uporabnika obstoječi sistemi ponujajo zgolj computing), saj sodobne naprave, senzorji in aktuatorjih, ki krmiljenje hišnih naprav ter nadzor stanja v zgradbi. Slednji se vse bolj množično pojavljajo v zgradbah (senzorji poteka preko različnih pametnih naprav in v redkih primerih prisotnosti, senzorji gibanja, senzorji odprtosti oken, preko spletnih vmesnikov. Nastavljanje režima delovanja in senzorji na mobilnih telefonih, osebne vremenske postaje scenarijev je prepuščeno samim uporabnikom, iz česar itd.) omogočajo beleženje najrazličnejših informacij in običajno sledi, da so urniki nastavljeni površno in zato kopičenje znanj tako o obnašanju posameznega uporabnika, neučinkovito. Poleg tega se navade uporabnikov neprestano kot o obnašanju sistema. Uporaba takšnih znanj se izkorišča spreminjajo. V določenem obdobju lahko npr. uporabniki v sistemih, ki spodbujajo uporabnike k zmanjšanju porabe začnejo prihajati domov kasneje; ob nespremenjenem energije s spodbujanjem k na primer nižanju želenih urniku to pomeni, da se začne ogrevanje hiše prezgodaj, kar temperatur ogrevanja ali pa k izbiri primernih prostorov v se odraža v večji porabi energentov in posledično v višjih službi za potrebe sestankov (manj ljudi - manjši prostor - obratovalnih stroških. Poleg tega se vedno bolj uveljavlja manj energije za ogrevanje) [3]. Znanje o uporabnikih se uporaba fotovoltaike kot komplementaren vir energije. izkorišča za gradnjo modelov uporabnikovega obnašanja in Predpostavljamo, da lahko z inteligentnim kombiniranje uporabo le-teh pri vodenju in adaptaciji sistemov ogrevanja, različnih virov energije zmanjšamo stroške obratovanja in razsvetljave, prezračevanja in ogrevanja sanitarne vode hkrati ohranimo zadovoljivo stopnjo udobja uporabnikov. [4,5]. Prihranki energije se gibljejo med 5-30%. Za reševanje zgornjih nalog predlagamo sistem, ki je Veliko projektov na temo izvedbe testnih pametnih hiš sposoben sprotnega spremljanja dogajanja v hiši, učenja in stanovanj je bilo že dokončanih. Leta 1990 so izdelali navad uporabnikov, prilagajanja delovanja glede na Neural Network House [6], kjer so uporabljali nevronske spremembe v navadah uporabnikov in optimiziranja mreže za inteligentno vodenje sistemov. Sledila sta delovanja celotnega sistema hišne avtomatizacije. IHome[7] in MavHome[8], temelječa na inteligentnem več- agentnem pristopu nadzora in vodenja sistemov z uporabo 93 tehnik za modeliranje in napovedovanje uporabnikovega Zadnja dva tipa agentov sta še podporni in obnašanja in akcij. Gator Tech Smart House [9] je splošno komunikacijski agenti, ki so lastni arhitekturi in skrbijo za uporaben študijski projekt za raziskavo tehnik vseprisotnega prenašanje sporočil med agenti, beleženje posameznih računalništva (ang. pervasive computing). Eden zadnjih agentov, posredovanje podatkov, ontologij ipd. Simbolična projektov - ThinkHome [10] uporablja širok nabor shema arhitekture je prikazana na sliki 2. Sama arhitektura podatkov o okolju, vremenu in uporabniku za namene je zasnovana tako, da je omogočeno enostavno dodajanje in študije vodenja pametnih domov. odstranjevanje agentov v sistem. Vsi sistemi se povečini osredotočajo zgolj na določene vidike upravljanja stavbe. ThinkHome, na primer, poskuša predvidevati temperaturo, ki bo za uporabnika najudobnejša. Poleg tega poskuša predvidevati, kdaj bo nek uporabnik prisoten. 
Sistem OpUS je obširnejši, saj poskuša modelirati večje število parametrov pametne hiše obenem pa v algoritme vodenja vključuje uporabniške navade in optimizacijske algoritme. 3 ARHITEKTURA SISTEMA Arhitektura sistema OpUS je definirana s hierarhično urejenim več-agentnim sistemom [11]. Agenti so avtonomne entitete, ki so sposobne zaznavanja in interakcije z okolico skladno z njihovimi preferencami, Slika 1: Hierarhič na več -agentna arhitektura. lastnostmi in aktivnimi cilji. Sposobni so samostojnega Sistem je logično razdeljen na dva dela, kot je prikazano na razmišljanja in sodelovanja za dosego skupnih ciljev. Vsak sliki 3. Zgornji nivo vsebuje tako imenovano inteligenco, agent v OpUS agentni arhitekturi je določen z agentno torej agente, ki skrbijo za generiranje novih, za uporabnika ovojnico, ki natančno definira posameznega agenta. ustreznejših urnikov. Spodnji nivo (t. i., hrbtenica) vsebuje Ovojnica določi vhodne podatke, ki jih agent zahteva, agente vodenja, ki skrbijo za izvajanje urnikov in sprotno akcije, ki jih lahko izvaja in izhodne podatke, ki jih lahko sinhronizacijo z realnim okoljem. Pomembna predpostavka posreduje. v sistemu je ta, da lahko sistem ob morebitnem izpadu 3.1 Hierarhično urejena agentna arhitektura zgornjega dela še vedno nemoteno deluje in skrbi za upravljanje z napravami v zgradbi. Agentna arhitektura ni samo skupek agentov, ki določa hierarhične odnose med njimi, ampak skrbi za beleženje stanja vsakega agenta, določa način komunikacije med njimi in omogoča podporo za sprotno simuliranje dinamike v prototipnem okolju. Sistem OpUS sestavlja šest različnih tipov agentov. Agent pasivna naprava je naprava, ki zgolj beleži in posreduje določen podatek. Primer takega agenta je senzor temperature zraka. Agent aktivna naprava je naprednejši in omogoča upravljanje z napravo. Primer takega agenta je toplotna črpalka, ki se ga lahko vklaplja, izklaplja in nastavlja na določeno temperaturo. Naslednji tip agentov je program, slednji združuje agente, ki ponujajo različne servise znotraj arhitekture. V to kategorijo spadajo agent dnevnik, ki beleži spremembe v sistemu, agent upravljanja Slika 2: Povezovanje agentov znotraj arhitekture. sistema, agenti za učenje, agenti za optimizacijo, simulacijo ipd. Četrti tip agentov hišnik so upravljavski agenti, ki 3.2 Učenje navad uporabnika omogočajo nadzor in upravljanje določenih sklopov pametne stavbe. Odvisno od razdelitve in same hierarhične V sistem OpUS je vključen modul za učenje navad strukture, lahko ti agenti upravljajo sistem ogrevanja, uporabnika. Slednji poskuša zgraditi model obnašanja določeno sobo ali pa celotno zgradbo. Agent hišnik, ki je v uporabnika na podlagi opazovanja prisotnosti v hiši in v agentni hierarhiji postavljen najvišje, skrbi tako za posameznih sobah. Zgrajeni model lahko za določeno koordinacijo med ostalimi višjenivojskimi agenti, kot tudi obdobje napove verjetnost, da je uporabnik prisoten ali za komunikacijo z zunanjim svetom. Zgradba je namreč odsoten. Napoved prisotnosti uporabnika omogoči celo lahko vključena v pametno mesto, ki lahko od nje zahteva vrsto dodatnih storitev, ki so v sodobnem sistemu hišne aktivno pogajanje o cenah energentov ipd. avtomatizacije nujne. Natančna napoved časa odhoda uporabnika omogoča varčevanje z energijo, saj lahko sistem 94 predčasno izklopi ogrevanje. Podobno lahko natančna večkriterijski optimizacijski algoritem poišče množico napoved časa prihoda omogoči povečanje občutka udobja urnikov, ki predstavljajo kompromisne rešitve glede na pri uporabnikih. 
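The agent envelope described above declares the inputs an agent requires, the actions it can execute and the outputs it can provide. The following is a minimal, hypothetical sketch of such an envelope for a passive device (a temperature sensor) and an active device (a heat pump); it only illustrates the concept and is not the actual OpUS implementation.

```python
# Illustrative sketch only: one possible encoding of an "agent envelope"
# (required inputs, available actions, produced outputs). All names are
# hypothetical and not taken from the OpUS code base.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class AgentEnvelope:
    name: str
    inputs: List[str]                       # data the agent requires
    outputs: List[str]                      # data the agent can provide
    actions: Dict[str, Callable] = field(default_factory=dict)  # commands it accepts

# A passive device only reports a value.
temperature_sensor = AgentEnvelope(
    name="air_temperature_sensor",
    inputs=[],
    outputs=["air_temperature"],
)

# An active device can additionally be controlled.
heat_pump = AgentEnvelope(
    name="heat_pump",
    inputs=["target_temperature"],
    outputs=["power_consumption", "state"],
    actions={
        "turn_on": lambda: print("heat pump on"),
        "turn_off": lambda: print("heat pump off"),
        "set_temperature": lambda t: print(f"set point {t} degrees C"),
    },
)

heat_pump.actions["set_temperature"](21.5)
```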
obravnavana nasprotujoča si kriterija. Dobljene rešitve Gradnja modela poteka tako, da sistem dlje časa beleži predstavljajo približek za t.i. Pareto fronto [12], na osnovi senzorske podatke o prisotnosti in sproti gradi verjetnostni katere nato uporabnik preko uporabniškega vmesnika izbere model navad uporabnika. Prednost sprotne gradnje je v tem, nov, zanj najustreznejši urnik. da omogoča prilagajanje na spremembe v navadah Algoritem gradi množico rešitev iz začetnega urnika, s uporabnika. Sistem nato dnevno uporabi zgrajeni model za preiskovanjem celotnega prostora urnikov in uporabo napoved časa odhoda in prihoda uporabnika. S tem simulatorja za ovrednotenje generirane rešitve iz vidika optimizacijskemu agentu zmanjša preiskovalni prostor, saj porabe in udobja. Tekom izvajanja se generirane rešitve določi intervale, kjer z večjo verjetnostjo pride do izboljšujejo in najboljše rešitve z vidika obeh kriterijev sprememb v režimu delovanja. tvorijo Pareto fronto nedominiranih urnikov. Uporabnik lahko preko uporabniškega vmesnika izbere enega od teh 3.3 Model in simulacija zgradbe urnikov glede na preference o načinu vodenja stavbe. Za Sposobnost natančnega simuliranja realnega okolja izvajanje izbranega urnika poskrbi agentna arhitektura, ki omogoča predhodno verifikacijo strategij vodenja in izbiro sočasno upravlja simulirano in realno prototipno okolje. tiste, ki vodi do najustreznejšega stanja sistema. V sistemu 3.5 Komunikacijski vmesnik OpUS je simulacija ključna za poganjanje optimizacijskih algoritmov in sprotnega izvajanja dinamike sistema. Eden pomembnih modulov v arhitekturi je komunikacijski Razviti simulacijski agent uporablja orodje vmesnik, ki skrbi za sinhronizacijo agentne arhitekture z EnergyPlus[13], ki omogoča simuliranje poljubnih modelov realnim kontejnerskim prototipom. OpUS preko HTTP stavb in različnih naprav. Ključna naloga je zgraditi model zahtevkov pridobiva informacije o stanju naprav stavbe, ki kar se da natančno odraža realno prototipno priključenih na kontrolerje. okolje. Orodje omogoča definiranje naprav s poljubnimi Primer zahtevka za branje temperature v prostoru: specifikacijami, omogoča integracijo fotovoltaike in http://www.exp.com/scgi/?c11025.ts01_temperature akumulatorjev. Za čim bolj natančno izvajanje simulacije _0 uporablja informacije o geografski lokaciji modela, realne informacije o vremenu in sončnem obsevanju. Simulacijski agent omogoča sprotno spreminjanje parametrov delovanja Pri čemer je »c11025« CyBro krmilnik z naslovom 11025, in izpis stanja naprav. Na sliki 3 je prikazana zaslonska ts01_temperature_o pa je ime spremenljivke v krmilniku. maska pregleda porabe posameznih naprav znotraj simulacije. Strežnik vrne odgovor v formatu XML: c11025.ts01_temperature_0 268 Measured internal temperature, multiplied by 10 (e.g. 247 means 24.7°C). Dobljene vrednosti se uporabljajo za sinhronizacijo agentne arhitekture z dogajanjem v realnem okolju, tako da se vrednosti v XML odgovoru prepišejo v ustreznega agenta- napravo. Vodenje sistema poteka na podoben način preko http Slika 3: Prikaz posamezne porabe tekom simulacije. zahtevkov. Na podlagi vrednosti v izbranem urniku pošlje sistem zahtevek za spremembo delovanja naprave. Prednost 3.4 Optimizacija delovanja takega pristopa je decentraliziranost, saj ni potrebno, da sta Sistem OpUS uporablja optimizacijo za izračun novih sistem OpUS in krmilnik locirana na istem strežniku. 
S tem urnikov, ki so za uporabnika ustreznejši in omogočajo tudi ločimo inteligentni del sistema od logičnega dela, ki manjše stroške in večje udobje. Z uporabo simulatorja skrbi zgolj za izvajanje ukazov na krmilniku. stavbe, ki za podani urnik vrača stroške in udobje, 95 review." Renewable and Sustainable Energy Reviews 13.6 (2009): 1246-1261. [2] R. Baños, F. Manzano-Agugliaro, F.G. Montoya, C. Gil, A. Alcayde, J. Gómez, Optimization methods applied to renewable and sustainable energy: A review, Renewable and Sustainable Energy Reviews, Volume 15, Issue 4, May 2011, Pages 1753-1766, ISSN 1364-0321, http://dx.doi.org/10.1016/j.rser.2010.12.008 [3] Laura Klein, Jun-young Kwak, Geoffrey Kavulya, Farrokh Jazizadeh, Burcin Becerik-Gerber, Pradeep Varakantham, Milind Tambe, Coordinating occupant behavior for building energy and comfort management using multi-agent systems, Automation in Construction, Volume 22, March 2012, Pages 525-536, ISSN 0926- 5805, http://dx.doi.org/10.1016/j.autcon.2011.11.012 [4] Lu, Jiakang, et al. "The smart thermostat: using Slika 4: Upravljanje z agenti aktuatorji. occupancy sensors to save energy in homes." Na sliki 4 je prikazan vmesnik za upravljanje z Proceedings of the 8th ACM Conference on Embedded napravami v sistemu OpUS in hkrati, preko Networked Sensor Systems. ACM, 2010. sinhronizacijskega mehanizma, v realnem prototipnem [5] Agarwal, Yuvraj, et al. "Occupancy-driven energy okolju. Na zaslonski maski je prikazana trenutna poraba management for smart building automation." naprav ter celotnega sistema, prikazani so podatki iz Proceedings of the 2nd ACM Workshop on Embedded senzorjev, stanje baterije in trenutna vremenska napoved. Sensing Systems for Energy-Efficiency in Building. Posamezne naprave je mogoče priklapljati in izklapljati ter ACM, 2010. nastavljati parametre delovanja. [6] M. C. Mozer, The neural network house: An environment hat adapts to its inhabitants, Proc AAAI 4 ZAKLJUČEK Spring Symp Intelligent Environments. pp. 110–114, (1998) V pričujočem prispevku smo opisali celotno arhitekturo [7] V. Lesser, M. Atighetchi, B. Benyo, B. Horling, V. L. sistema OpUS in opisali glavne module sistema. Uporaba M. Atighetchi, A. Raja, R. Vincent, P. Xuan, S. X. več-agentne paradigme olajša razvoj kompleksnih sistemov, Zhang, T. Wagner, P. Xuan, and S. X. Zhang. The kjer je pomembna sprotna komunikacija in učinkovito intelligent home testbed. In Proceedings of the sodelovanje med komponentami. Autonomy Control Software Workshop, (1999). Vseh pet glavnih agentov sistema je načrtovanih tako, [8] D. Cook,M. Youngblood, I. Heierman, E.O., K. da lahko uporabljajo storitve drugih agentov in hkrati Gopalratnam, S. Rao, A. Litvin, and F. Khawaja. ponudijo na razpolago svoje funkcionalnosti. Tipičen Mavhome: an agent-based smart home. In Pervasive primer je simulacijski agent, ki uporablja storitve vmesnika Computing and Communications, 2003. (PerCom 2003). ter podatke v agentni arhitekturi in ponuja storitve Proceedings of the First IEEE International Conference simulacije optimizacijskemu agentu ter omogoča sprotno on, pp. 521 – 524 (march, 2003). doi: izvajanje s strani agente arhitekture. 10.1109/PERCOM.2003.1192783. Takšen dinamičen in inteligenten sistem lahko znatno [9] S. Helal, W. Mann, H. El-Zabadani, J. King, Y. zniža porabo in s tem stroške inteligentne hiše. S tem pa Kaddoura, and E. Jansen, The gator tech smart house: a vzpodbuja ekološko ozaveščenost uporabnikov in programmable pervasive space, Computer. 
38(3), 50 – zmanjšuje negativne vplive na okolje. 60 (march, 2005). ISSN 0018-9162. doi: Obstoječi tržni sistemi hišne avtomatizacije 10.1109/MC.2005.107. predstavljenih rešitev ne uporabljajo. Deloma zaradi višjih [10] C. Reinisch, M. J. Kofler, F. Iglesias, and W. Kastner, stroškov razvoja, kar se direktno preslika v ceno takega Thinkhome energy efficiency in future smart homes, sistema, deloma pa zaradi trenutno nezanesljivega in EURASIP J. Embedded Syst. 2011, 1:1–1:18 (Jan., nerobustnega delovanja. Sistem OpUS se uspešno spopada 2011). ISSN 1687-3955. doi: 10.1155/2011/104617 z navedenimi težavami in z uporabniškega stališča omogoča [11] Multi-Agent system: prijazno uporabo in veliko mero avtonomnosti. https://en.wikipedia.org/wiki/Multi-agent_system Reference [12] Kalyanmoy Deb. Multi-Objective Optimization Using Evolutionary Algorithms. Wiley, Nw York, 2001. [1] Dounis, Anastasios I., and Christos Caraiscos. [13] EnergyPlus. "Advanced control systems engineering for energy and http://apps1.eere.energy.gov/buildings/energyplus/energ comfort management in a building environment—A yplus_about.cfm 96 PREDICTIVE PROCESS-BASED MODELING OF AQUATIC ECOSYSTEMS Nina Vidmar1, Nikola Simidjievski2,3, Sašo Džeroski2,3 Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana, Slovenia1 Jožef Sefan Institute, Ljubljana, Slovenia2 Jožef Stefan International Postgraduate School, Ljubljana, Slovenia3 e-mail: nina.vidmar@student.fmf.uni-lj.si, {nikola.simidjievski, saso.dzeroski}@ijs.si ABSTRACT number of aquatic ecosystems, such as: lake ecosystems [4, 5, 6, and 7] and marine ecosystems [3]. However, these In this paper, we consider the task of learning studies focus on obtaining explanatory models of the aquatic interpretable process-based models of dynamic systems. ecosystem, i.e., modeling the measured behavior of the While most case studies have focused on the descriptive system at hand, while modeling future behavior is not aspect of such models, we focus on the predictive aspect. considered. In contrast, Whigham and Recknagel [8] discuss We use multi-year data, considering it as a single the predictive performance of process-based models in a lake consecutive dataset or as several one-year datasets. ecosystem. However, either they assume a single model Additionally, we also investigate the effect of structure and focus on the task of parameter identification, or interpolation of sparse data on the learning process. We explore different model structures where the explanatory evaluate and then compare the considered approaches on aspect of the model is completely ignored. The method the task of predictive modeling of phytoplankton proposed by Bridewell et.al [9] focuses of establishing robust dynamics in Lake Zürich. interpretable process-based models, by tackling the over- fitting problem. Even though this method provides estimates of model error on unseen data, these estimates are not related 1 INTRODUCTION to the predictive performance of the model, i.e., its ability to Mathematical models play an important role in the task of predict future system behavior beyond the time-period describing the structure and predicting the behavior of an captured in training data. Most recently, the study of arbitrary dynamic system. In essence, a model of a dynamic Simidjievski et.al [10] focuses on the predictive performance system consists of two components: a structure and a set of of process-based models by using ensemble methods. parameters. 
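To make the notion of a model as a structure plus a set of parameters concrete, the sketch below simulates a single, deliberately simple ODE structure (logistic growth) for one choice of parameters. The equation is a stand-in chosen for illustration only and is not the phytoplankton model considered later in the paper.

```python
# Toy illustration (not the paper's model): a fixed model *structure*, here a
# single logistic-growth ODE, whose behavior is obtained by simulation once
# the *parameters* (r, K) are given.
import numpy as np
from scipy.integrate import solve_ivp

def structure(t, y, r, K):
    """dy/dt = r * y * (1 - y / K): one candidate structure for a growth process."""
    return r * y * (1.0 - y / K)

params = {"r": 0.3, "K": 10.0}            # one particular parameter calibration
sol = solve_ivp(structure, t_span=(0, 60), y0=[0.5],
                args=(params["r"], params["K"]), dense_output=True)

t = np.linspace(0, 60, 7)
print(np.round(sol.sol(t)[0], 2))         # simulated behavior over time
```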
There are two basic approaches to constructing However, while their proposed ensemble methods improve models of dynamic systems, i.e., theoretical (knowledge- the predictive performance of the process-based models, the driven) modeling and empirical (data-driven) modeling. In resulting ensemble model is not interpretable. the first, the model structure is derived by domain experts of In this paper we tackle the task of establishing an the system at hand, the parameters of which are calibrated interpretable predictive model of a dynamic system. We focus using measured data. In contrast, the later uses measured data on predicting the concentration of phytoplankton biomass in to find the most adequate structure-parameter combination aquatic ecosystems. Due to the high dynamicity and various that best fits the given task of modeling. In both approaches, seasonal exogenous influences [6, 7], most often process- models often take the form of ordinary differential equations based models of such systems are learned using short time- (ODEs), a widely accepted formalism for modeling dynamic periods of observed data (1 year at most). Note however, this systems, allowing the behavior of the system to be simulated short time-periods of data are very sparse, i.e., consist of very over time. few measured values, thus, most often the measurements are Equation discovery [1, 2] is the area of machine learning interpolated and daily samples are obtained from the dealing with developing methods for automated discovery of interpolation. quantitative laws, expressed in the form of equations, from The initial experiments to this end, indicate that the collections of measured data. The state-of-the-art equation predictive performance of such models is poor: While discovery paradigm, referred to as process-based modeling providing dense and accurate description of the observed [3], integrates both theoretical and empirical approaches to behavior, they fail at predicting future system behavior. To modeling dynamics. The result is a process-based model address this limitation we propose learning more robust (PBM) – an accurate and understandable representation of a process-based models. We conjecture that by increasing the dynamic systems. size of the learning data, more general process-base models The process-based modeling paradigm has already been will be obtained, thus yielding better predictive performance proven successful for modeling population dynamics in a while maintaining their interpretability. 97 Figure 1: Automated modeling with ProBMoT. The main contribution of this paper are the approaches to A process-based model consists of two basic types of handling the learning data. The intuitive way of increasing the elements: entities and processes. Entities correspond to the size of the learning data is by sequentially adding state of the system. They incorporate the variables and the predeceasing contiguous datasets, thus creating one long constants related to the components of the modeled system. time-period dataset, i.e., learning from sequential data (LSD). Each variable in the entity has its role. The role specifies In contrast, when learning from parallel data (LPD), the whether the variable is exogenous or endogenous. Exogenous model is learned from all the datasets simultaneously. Figure variables are explanatory/input variables, used as forcing 2 depicts the both approaches. 
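The following schematic sketch contrasts the two ways of handling multi-year data. It only shows how per-dataset discrepancies are assembled into an objective; in ProBMoT the substantive difference is whether the model is simulated once over the contiguous time span (LSD) or separately for each short dataset (LPD). The error function is a placeholder, not an actual model simulation.

```python
# Schematic sketch of the two data-handling strategies (not ProBMoT code).
# `error(params, dataset)` stands in for any discrepancy measure between the
# simulated and the observed behavior on one dataset.

def error(params, dataset):
    # placeholder: "simulate" the model with `params` and compare to `dataset`
    return sum((params["a"] * x - y) ** 2 for x, y in dataset)

def lsd_objective(params, yearly_datasets):
    """Learning from sequential data: the yearly datasets are joined into one
    long contiguous dataset and evaluated as a whole."""
    merged = [point for year in yearly_datasets for point in year]
    return error(params, merged)

def lpd_objective(params, yearly_datasets):
    """Learning from parallel data: each yearly dataset is evaluated
    separately and the discrepancies are combined."""
    return sum(error(params, year) for year in yearly_datasets)

years = [[(1, 2.1), (2, 3.9)], [(1, 1.8), (2, 4.2)]]
print(lsd_objective({"a": 2.0}, years), lpd_objective({"a": 2.0}, years))
```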
The two approaches, in terms terms of the dynamics of the observed system (and are not of learning process-based models, are described in more modeled within the system). Endogenous variables, are the detail in Section 3. response/output (system) variables. They represent the We test the utility of the two approaches on a series of internal state of the system and are the ones being modeled. tasks of modeling phytoplankton concentration in Lake The entities are involved in complex interactions represented Zürich. We use eight yearly datasets, using six for training, by the processes. The processes include specifications of the one for validation and one for testing the predictive entities that interact, how those entities interact (equations), performance of the obtained models. The aim of this paper is and additional sub-processes. two-fold: besides validating the performance of the two From the qualitative perspective, the unity of entity and approaches to handling data when learning predictive processes allows for conceptual interpretation of the modeled process-based models, we also test the quality of the training system. On the other hand, the entities and the processes data. For that purpose, we perform additional set of provide further modeling details that allow for transformation experiments, similar to the previous. However, instead of from conceptual model to equations and therefore simulation using the interpolated data for learning the models – we use of the system, i.e., providing the quantitative abilities of the the original (sparse) measured values, thus examining the process-based model. The equations define the interactions influence of the interpolation on the predictive performance represented by the processes including the variables and of the process-based models. constants from the entities involved. The next section provides more details of the task of The process-based modeling paradigm allows for high- process-based modeling, and introduces a recent contribution level representation of domain-specific modeling knowledge. to the area of automated process-based modeling, i.e., the Such knowledge is embodied in a library of entity and process ProBMoT [4, 10] platform. Section 3 depicts the task of templates, which represent generalized modeling blueprints. predictive process-based modeling of aquatic ecosystems. The entity and process templates are further instantiated in Section 4 describes the data used in the experiments, the specific entities and processes that correspond to the design of the experiments, and the task specification. Section components and the interactions of the modeled system. 5 presents the results of the experiments. Finally, Section 6 These specific model components and interactions define the discusses the findings of this paper and suggests directions set of candidate model structures. for future work. The algorithm for inducing models employs knowledge- based methods to enumerate all candidate structures. For each 2 PROCESS-BASED MODELING AND PROBMOT obtained structure, a parameter estimation is performed using the available training data. For this reason each structure is The process-based modeling paradigm, addresses the task of compiled into a system of differential and algebraic learning process-based models of dynamic systems from two equations, which allows for the model to be simulated. In points of view: qualitative and quantitative. 
The first provides a conceptual representation of the structure of the modeled system. Still, this depiction does not provide enough detail to allow for simulation of the system's behavior. In contrast, the latter treats the process-based model as a set of differential and/or algebraic equations, which allows for simulation. In essence, this includes minimizing the discrepancy between the values of the simulated behavior obtained using the model and the observed behavior of the system. Recent implementations of the PBM approach include Lagramge2.0 [11], HIPM [12] and ProBMoT (Process-Based Modeling Tool) [4, 10], which is described next.

The Process-Based Modeling Tool (ProBMoT) is a software platform for simulating, parameter fitting and inducing process-based models. Figure 1 illustrates the process of automated modeling with ProBMoT. The first input to ProBMoT is a conceptual model of the modeled system. The conceptual model specifies the expected logical structure of the modeled system in terms of the entities and processes that we observe in the system at hand. The second input is the library of domain-specific modeling knowledge. By combining the conceptual model with the library of plausible modeling choices, candidate model structures are obtained.

The model parameters for each structure are estimated using the available training data (the third input to ProBMoT). The parameter optimization method is based on the meta-heuristic optimization framework jMetal 4.5 [13]; in particular, ProBMoT implements the Differential Evolution (DE) [14] optimization algorithm. For the purpose of simulation, each model is transformed into a system of ODEs, which are solved using the CVODE ODE solver from the SUNDIALS suite [15]. Finally, the last input is a separate validation dataset. In both cases (LSD and LPD), the model with the best performance on the validation dataset is the output of the automated modeling process.

3 PREDICTIVE PROCESS-BASED MODELING OF AQUATIC ECOSYSTEMS

ProBMoT has been used extensively to model aquatic ecosystems [4, 5 and 6]. Most of the case studies, however, have focused on descriptive modeling, i.e., on the content and interpretation of the learned models and not on their accuracy and predictive performance (with the exception of [10]). Predominantly, models have been learned from short time-period (one-year) datasets, as considering longer time periods of data resulted in models of poor fit. These models, however, had poor predictive power when applied to new (unseen) data.

We use ProBMoT to learn predictive models of aquatic ecosystems from long time-period (multi-year) datasets. ProBMoT supports predictive modeling, as the obtained models can be applied and evaluated on a testing dataset. Taking the input/exogenous variable values from the test dataset, ProBMoT simulates the model at hand and makes predictions for the values of the output/endogenous (system) variables. Using the output specifications, the values of the output variables of the model are calculated and compared to the output variables from the test set, thus allowing the predictive performance of the model to be assessed.

Concerning the use of long time-period datasets, ProBMoT supports two different approaches, i.e., learning from sequential data (LSD) and learning from parallel data (LPD). The parameter optimization algorithm uses the available training data from the observed system to estimate the numerical values of the parameters. When learning from sequential data, illustrated in Figure 2a, ProBMoT takes as input one training dataset. The training dataset is comprised of several contiguous short time-period datasets, thus the parameters are estimated over the whole time span. On the other hand, when learning from parallel data, depicted in Figure 2b, ProBMoT takes as input several short time-period training datasets. The parameter optimization algorithm handles the short time periods in parallel, i.e., it estimates the optimal model parameters by minimizing the discrepancy between the simulated behavior and each individual training set.

ProBMoT offers a wide range of objective functions for measuring model performance, such as the sum of squared errors (SSE) between the simulated values and the observed data, the mean squared error (MSE), the root mean squared error (RMSE), and the relative root mean squared error (ReRMSE), which is used in all experiments presented here, both when learning the models and when evaluating their performance. The relative root mean squared error (ReRMSE) [16] is defined as

    ReRMSE(m) = sqrt( Σ_{t=0}^{n} (y_t − ŷ_t)² / Σ_{t=0}^{n} (ȳ − ŷ_t)² )    (1)

where n denotes the number of measurements in the test data set, ŷ_t and y_t correspond to the measured and predicted value (obtained by simulating the model m) of the system variable y at time point t, and ȳ denotes the mean value of the system variable y in the test data set.
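Equation (1) translates directly into a few lines of code. The sketch below assumes, as is customary, that ȳ denotes the mean of the measured values in the test set; the function name and array-based interface are ours, not ProBMoT's.

```python
# A direct transcription of Equation (1), assuming y_bar is the mean of the
# measured values in the test set.
import numpy as np

def rermse(measured, predicted):
    measured = np.asarray(measured, dtype=float)    # \hat{y}_t in the paper's notation
    predicted = np.asarray(predicted, dtype=float)  # y_t, obtained by simulating model m
    num = np.sum((predicted - measured) ** 2)
    den = np.sum((measured.mean() - measured) ** 2)
    return float(np.sqrt(num / den))

# ReRMSE = 1 corresponds to a model no better than predicting the mean;
# values below 1 indicate an improvement over that baseline.
print(rermse([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```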
The data on the aquatic systems are very sparse (e.g., measured on a monthly basis). In the above-mentioned studies, they have typically been interpolated and sampled at a daily interval. Here, to assess the effect of the interpolation on the performance of the models, we also consider using only the original measured values when establishing the predictive process-based model.

Figure 2: Two approaches to predictive modeling. a) Learning from sequential data (LSD), and b) Learning from parallel data (LPD).

4 EXPERIMENTAL SETUP

In this study, we apply the automated modeling tool ProBMoT to the task of predictive modeling of phytoplankton dynamics in Lake Zürich, Switzerland. We empirically evaluate the two different approaches for learning predictive models, LSD and LPD, on this task. We apply those two on interpolated data (sampled daily) and on the original (sparse) data.

4.1 Data & domain knowledge

The datasets used for our experiments were obtained from the Water Supply Authority of Zürich. Lake Zürich is a lake in Switzerland, extending southeast of the city of Zürich. It has an average depth of 49 m, a volume of 3.9 km3 and a surface area of 88.66 km2. The measurements consist of physical, chemical and biological data for the period from 1992 to 1999, taken once a month at 19 different sites, and averaged to the respective epilimnion (upper ten meters) and hypolimnion (bottom ten meters) depths. The data were interpolated with a cubic spline algorithm and daily samples were taken from the interpolation [17].
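The interpolation step described above can be reproduced, for instance, with a cubic spline fitted to the monthly measurements and evaluated at a daily interval; the numbers below are invented for illustration.

```python
# Sketch of the preprocessing described above: monthly measurements are
# interpolated with a cubic spline and sampled once per day.
import numpy as np
from scipy.interpolate import CubicSpline

day_of_year = np.array([15, 46, 74, 105, 135, 166, 196, 227, 258, 288, 319, 349])
chlorophyll = np.array([1.2, 1.0, 2.5, 6.3, 9.8, 7.1, 5.4, 4.9, 3.8, 2.6, 1.9, 1.4])

spline = CubicSpline(day_of_year, chlorophyll)
daily_days = np.arange(day_of_year[0], day_of_year[-1] + 1)   # one sample per day
daily_values = spline(daily_days)

print(len(daily_values), daily_values[:5].round(2))
```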
Both the original and the interpolated data from the first six years (1992-1997) were used for training the models, data from year 1998 for validation, and data from 1999 to estimate the predictive performance of the learned models.

The population dynamics model considered consists of one endogenous/output (system) variable and multiple exogenous/input variables structured within a single ODE. The phytoplankton biomass is represented as the system variable, while the exogenous variables include: the concentration of zooplankton, dissolved inorganic nutrients (nitrogen, phosphorus, and silica) and two environmental influences, water temperature and global solar radiation (light).

The library for process-based modeling of aquatic ecosystems used in our experiments is the one presented by Atanasova [18]. In particular, to reduce the computational complexity of our experiments, we use a simplified version of the library, which results in a total of 128 candidate models.

4.2 ProBMoT parameter settings

For the parameter calibration procedure we use Differential Evolution with the rand/1/bin strategy and 1000 evaluations over a population of 50 individuals. For simulating the ODEs we use the CVODE simulator with absolute and relative tolerances set to 10^-3. For measuring the model performance we use the objective function ReRMSE, described in Section 3. To further assess the significance of the differences in performance between the single-dataset approach and the multiple-datasets approach we use the Wilcoxon test for statistical significance [19], as presented by Demšar [20].

4.3 Experimental design

In this paper we compare the performance of the two different approaches (LSD and LPD) to learning predictive process-based models. For each approach we learn six process-based models using the available training data of six successive years (1992-1997). In all of the experiments, the validation dataset (year 1998) and the test dataset (year 1999) remain the same. For both cases, we start with one short time-period training dataset.

5 RESULTS

Table 1 summarizes the performance comparison between models learned from sequential data (LSD) and models learned from parallel data (LPD), using both interpolated (left-hand side) and original (right-hand side) training data. Note that, in both cases, learning from sequential data yields better predictive performance than learning from parallel data. The Wilcoxon test comparing the models learned from sequential data (LSD) and models learned from parallel data (LPD) shows that using LSD is better than using LPD; however, the difference in performance is not substantial nor significant (p-value = 0.11).

Table 1: Comparison of the predictive performances (ReRMSE on test data) of models learned from sequential data (LSD) and models learned from parallel data (LPD), from both interpolated and original samples. The numbers in bold indicate the best result for the given years.

Train data      Interpolated          Original
(years)         LSD      LPD          LSD      LPD
'97             1.398    1.398        1.074    1.074
'96-'97         1.099    1.391        1.381    1.469
'95-'97         1.006    1.044        0.984    1.084
'94-'97         0.986    1.094        1.004    1.112
'93-'97         1.075    1.109        1.105    1.085
'92-'97         0.934    0.998        1.074    0.974
Wilcoxon test   LSD > LPD;            LSD > LPD;
                p-value = 0.11        p-value = 0.11

Next, as shown in Table 1, using the original measured values when learning the models did not improve their predictive performance.

Finally, and most importantly, from both experiments performed we can conclude that using large amounts of training data (even interpolated) improves the overall predictive performance of the learned process-based models. Note, however, that for one case ('93-'97) the performance of the models does not improve. Further investigations are required to determine whether this phenomenon is due to the quality of the data of that particular dataset ('93), or to the dynamics of the system at that particular period significantly differing from the rest.

6 CONCLUSION

In this paper, we tackle the task of learning predictive interpretable process-based models of dynamic systems.
In time-period training dataset (year 1997), and continue for five the process of establishing general and robust predictive steps adding one preceding year to the training data set. At models, we investigate learning from parallel data (LPD), in each step we learn the process-based models accordingly to contrast to the state-of-the-art approach of learning from the two approaches described in the previous section. sequential data (LSD). We apply the both approaches to the First, we apply this two approaches on the interpolated task of modeling phytoplankton dynamics in Lake Zürich, data, or more precisely, daily samples of interpolated data. using ProBMoT, a platform for simulating, parameter fitting Second, we apply the two learning approaches to the original and inducing process-based models. Additionally, besides (sparse) training data. In all of the experiments the validation validating the performance of the approaches to learning 100 predictive process-based models, we also test the quality of References: the training data by learning models from the original [1] P. W. Langley, H. A. Simon, G. Bradshaw, J. M. Zytkow. Scientific measured values, in contrast to learning models from daily Discovery: Computational Explorations of the Creative Processes. samples of interpolated data. MA: The MIT Press, Cambridge, MA, USA. 1987. The general conclusion of this paper is that using larger [2] S. Džeroski, L. Todorovski. Learning population dynamics models amounts of training data for learning process-based models from data and domain knowledge. Ecological Modelling 170, pp. 129– 140. 2003. yields improved predictive performance for tasks of modeling [3] W. Bridewell, P. W. Langley, L. Todorovski, S. Džeroski. Inductive aquatic ecosystems. Both, Atanasova et al [5] and Taškova et process modeling. Machine Learning 71, pp. 1–32. 2008. al. [6] clearly state that one-year datasets produce models [4] D. Čerepnalkoski, K. Taškova, L. Todorovski, N. Atanasova, S. Džeroski. The influence of parameter fitting methods on model with poor predictive performance. We show that using data structure selection in automated modeling of aquatic ecosystems. from a longer period, considered either consequently (LSD) Ecological Modelling 245 (0), pp. 136–165. 2012. or parallel (LPD) helps in deriving more general models, and [5] K. Taškova, J. Šilc, N. Atanasova, S. Džeroski. Parameter estimation therefore, better predictive models. in a nonlinear dynamic model of an aquatic ecosystem with meta- heuristic optimization. Ecological Modelling 226, pp. 36–61. 2012. Even though the statistical significance comparison shows [6] N. Atanasova, F. Recknagel, L. Todorovski, S. Džeroski, B. Kompare. that the LSD approach has better performance than the LPD Computational assemblage of Ordinary Differential Equations for approach, the difference in performance is neither substantial Chlorophyll-a using a lake process equation library and measured data nor significant. Nevertheless, when learning from sequential of Lake Kasumigaura. In: Recknagel, F.(Ed.), Ecological Informatics. data, due to the mater of simulation and parameter Springer, pp. 409–427. 2006a. [7] N. Atanasova, L. Todorovski, S. Džeroski, R. Remec, F. Recknagel, optimization, the available training data considered for B. Kompare. Automated modelling of a food web in Lake Bled using learning process-based models should be contiguous. On the measured data and a library of domain knowledge. 
Ecological other hand, one useful feature of the LPD approach is that can Modelling 194 (1-3), pp. 37–48. 2006c. [8] P. Whigham, F. Recknagel, F. Predicting Chlorophyll-a in freshwater handle missing data (e.g. intermediate period with no lakes by hybridising process-based models and genetic algorithms. measurements) for establishing robust process-based models. Ecological Modelling 146 (13), pp. 243–251. 2001. Our empirical evaluation of learning from the original [9] W. Bridewell, N. B. Asadi, P. Langley, L. Todorovski. Reducing uninterpolated and sampled interpolated data, showed that the overfitting in process model induction. In: Proceedings of the 22nd International Conference on Machine learning. (ICML ’05). ACM, pp. interpolation does not affect the performance of the learned 81–88. 2005. process-based models. On the contrary, the models learned [10] N. Simidjievski, L. Todorovski, S. Džeroski. Learning ensembles of using the interpolated values yielded better performance than population dynamics models and their application to modelling aquatic the ones learned using the original values. We conjecture that ecosystems. Ecological Modelling (In Press). 2014. this is due to the sparsity of the original measured values (~ [11] L. Todorovski, S. Džeroski. Integrating domain knowledge in equation discovery. In: Džeroski, S., Todorovski, L. (Eds.), Computational 12 time-points per year), which is insufficient to capture the Discovery of Scientific Knowledge. Vol. 4660 of Lecture Notes in dynamics of such a system. Moreover, considering the Computer Science. Springer Berlin, pp. 69–97. 2007. relative performance between the two approaches, the LSD [12] L. Todorovski, W. Bridewell, O. Shiran, P. W. Langley. Inducing hierarchical process models in dynamic domains. In: Proceedings of approach performed insignificantly better than the LPD the 20th National Conference on Artificial Intelligence. AAAI Press, approach Pittsburgh, USA, pp. 892–897. 2005. Taken all together, some new questions arise for further [13] J. J. Durillo, A. J. Nebro. jMetal: A Java framework for multi-objective investigation. How strongly the quality of measurements optimization. Advances in Engineering Software 42, pp. 760–771. 2011. affects the results? Would the results change significantly in [14] R. Storn, K. Price. Differential Evolution – A simple and efficient the case of ideal measurements? Considering this, possible heuristic for global optimization over continuous spaces. Journal of directions for further work are as follows. First, performing Global Optimization 11 (4), pp. 341–359. 1997. more experiments using multiple parallel sets of data from [15] S. D. Cohen, A. C. Hindmarsh. CVODE, a stiff/nonstiff ODE solver in different periods and, data from various different lake C. Computers in Physics 10 (2), pp. 138–143. Mar. 1996. [16] L. Breiman. Classification and Regression Trees. Chapman & Hall, ecosystems should be used. In order to achieve more London, UK. 1984. controlled experiments, we consider testing the presented [17] A. Dietzel, J. Mieleitner, S. Kardaetz, P. Reichert. Effects of changes approaches on synthetic data, that is, data obtained by in the driving forces on water quality and plankton dynamics in three simulating a well-established model of an arbitrary aquatic swiss lakes long-term simulations with BELAMO. Freshwater Biology 58 (1), pp. 10–35. 2013. ecosystem. Finally, we would like to extend our approach to [18] N. Atanasova., L. Todorovski, S. Džeroski, B. Kompare. 
Constructing different ecosystems and other domains. a library of domain knowledge for automated modelling of aquatic ecosystems. Ecological Modelling 194 (13), pp. 14–36. 2006b. [19] F. Wilcoxon. Individual comparisons by ranking methods. Biometrics, Acknowledgements 1:80–83. 1945. We would also like to acknowledge the support of the [20] J. Demšar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7, pp. 1–30. Dec. 2006. European Commission through the project MAESTRA - Learning from Massive, Incompletely annotated, and Structured Data (grant number ICT-2013-612944). 101 RECOGNITION OF BUMBLEBEE SPECIES BY THEIR BUZZING SOUND Mukhiddin Yusupov1, Mitja Luštrek2, Janez Grad, Matjaž Gams2 Czech Technical University in Prague – Computer Science Department1 Jozef Stefan Institute – Intelligent Systems Department2 e-mail: yusupmuk@fel.cvut.cz, {mitja.lustrek, matjaz.gams}@ijs.si, janez.grad@siol.com ABSTRACT Although there are generalizations of the type of problem we are solving here, such as a system for classifying many The goal of our work is to help people to automatically types of insects [4], relatively little has been done previously identify the species and worker/queen type of on automatic recognition on bumblebee species with a bumblebee based on their recorded buzzing. Many detailed analysis of their buzzing sound. Several internet recent studies of insect and bird classification based on applications provide sounds and images and images of their sound have been published, but there is no different species of birds, frogs etc., but they rely on human thorough study that deals with the complex nature of pattern-recognition skills to identify the species at hand and buzzing sound characteristic of bumblebees. In this do not provide active help. Our study is related to active paper, a database of recorded buzzings of eleven species system help in recognizing a particular (sub)species, and in were preprocessed and segmented into a series of sound particular to other audio data classification problems like samples. Then we applied J48, MLP and SVM classification of general audio content [8], auditory scene supervised classification algorithms on some recognition, music genre classification and also to the predetermined sets of feature vectors. For five species speech recognition, which have been studied relatively the recognition rate was above 80% and for other six extensively during last few years also in our department. We species it was above 60%. At the end we consider how to here try to take advantage from these previous studies. further improve the results. Some studies like [7], where they also tried to classify bee 1 INTRODUCTION species, used different approaches. We can view these systems as solving a pattern recognition problem. In [7] the Bumblebees are important pollinators of many plants and recognition of bee species is performed visually, based on their colonies are now used extensively in greenhouse its image. The task was to find relevant patterns from the pollination of crops such as tomatoes and strawberries. image and identify similarities to specific species. However, Some bumblebee species are declining and it is a cause for pictures vary a lot based on different factors, and often a concern in Europe, North America and Asia. In Europe, picture does not represent well what we see in nature. 
In our around one quarter of species are threatened with extinction, work the patterns are buzzing sound events produced by according to recent studies. This is due to a number of bumblebees. The chosen approach is recognition based on a factors, including land clearing and agricultural practices. parametric (feature) representation of the sound events. There is a lot of research devoted to keep some bumblebee Features should be selected so that they are able to species from such decline. maximally distinguish sounds that are produced by different bumblebee species. Most of the recognition systems based Until now over 250 species are known. There are about 19 on audio and especially human voice recognition uses Mel- different species of bumblebee found in the UK, 68 in frequency cepstrum coefficients (MFCC) as a feature vector. Europe, 124 species in China, 24 in South America and 35 There are also works where feature vectors are Linear in Slovenia. The colonies of bumblebees are composed of a Predictive Coding coefficients (LPC) or a set of low-level queen and many workers. Since only experts can identify signal parameters like in [1]. the species by looking at or listening to them and their sound, we decided to make this identification easy for all. This paper uses MFCC and LPC to extract the features. For One needs to record the buzzing and provide it to the system the extraction of features and for other processing of audio (program) that will process it and then tell to which species records we used jAudio package [9]. Before feature and worker/queen type this buzz corresponds to. The extraction we preprocess and segment the sound recordings. program is accessible from the homepage of the Department We found that the segmentation is as important as the of intelligent systems at the Jozef Stefan Institute - extraction of features with a strong influence on the http://dis.ijs.si/. More information can be provided from prediction accuracy. Then we constructed the model janez.grad@siol.com. separately with three different classification algorithms: J48, MLP and SVM. Training and evaluation of a model were 102 performed on a stored database of fifteen species of However, spectral changes of signal parts are rather diverse bumblebees. The experiments were carried out using and detection of boundaries of such samples is difficult WEKA open source machine learning software. Results because adjacent samples of separate buzzings can overlap show that SVM has better performance than other two in time and frequency. Moreover, due to the starting point of systems. buzz is being slow it may occur below the background noise level. In Figure 1 we can see the representation of the sound 2 PREPROCESSING record of humilis queen. It is difficult to recognize there are three separate relevant parts and everything in between with Each sound record preprocessing consists of three steps: low frequency components as not relevant. normalization, pre-emphasis and segmentation. First the normalization is applied to the record by dividing it with In Figure 2 it is even more problematic to say where exactly maximum value: buzzing of the sylvarum worker starts and only in 20% of ~ the recording there is the buzzing signal we are interested in. 
x̃(i) = x(i) / max x(i),  0 ≤ i ≤ n−1,   (1)

where x(i) is the original signal, x̃(i) is the normalized signal and n is the length of the signal.

After that, pre-emphasis is performed in order to boost the high-frequency components, while leaving the low-frequency components in their original state. This approach is based on the observation that the sound data comes with high frequency and low magnitude, whereas the parts of the recording that we are not interested in (noise, gaps) incorporate low frequency and much higher energy. The pre-emphasis factor α is computed as

α = e^(−2πFt),   (2)

where t is the sampling period of the sound. The new sound is then computed by applying the filter

H(z) = 1 − αz^(−1).   (3)

The last step of the pre-processing is segmentation. In this step we separate the sound record into a number of samples which represent only the buzzing. Each sound record is 45 to 60 seconds long. Extracting features from the whole sound record, firstly, increases the computational complexity and, secondly, affects the accuracy of the recognition. We do not need to calculate the features for the silent, noisy and other irrelevant parts of the record.

During the investigation of the spectrum of each bumblebee species we also found out that the buzzing of the same species can vary based on the state of the bumblebee, or that during the buzzing of one species some other ones can join in, and as a result we get a combination of buzzes. The same species makes only one buzz when, for example, it is working, and has some other, different buzz when it is angry.

We have to take into account various factors in devising a segmentation method, since unsuccessful separation of a record would result in unsuitable candidate samples, and subsequently the parametric representation would differ from that of real signal data. That is why, for the current version of our work, we decided to segment the audio recordings manually with an audio editor program, so that we could see the result of recognition based purely on real signal data. On one hand, this decision of manual separation obliges us to use in the testing phase of the model only noiseless records where most parts of the record consist of signal data. But on the other hand we analyzed how the recognition accuracy changes when we change the segmentation strategy, since by visually looking at the spectrum of a record it is easier to segment it. In the current state of the work we segmented the recordings manually into samples of 1-4 seconds in length, and the parts with less than 1 second of buzzing duration were combined with adjacent samples.

Figure 1: Representation of the audio record of the humilis queen species in the time domain.

Figure 2: Representation of the audio record of the sylvarum worker species in the time domain.

3 FEATURE EXTRACTION AND MODEL CONSTRUCTION

For each sample segment we calculated MFCC and LPC features. These are the features most commonly used in audio-based classification tasks. Samples are processed in a window-by-window manner. The size and the overlapping factor of the windows are the two key parameters of feature extraction in any audio/signal processing task. We found that a window size of 2014 and an overlapping factor of 30% give us the feature vectors which subsequently resulted in the best recognition model. For each window we have several MFCC or LPC values. It is better to represent each window with one feature value by aggregating all the values in a window, so we applied the aggregation by computing the mean value for each window.
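The preprocessing chain of Equations (1)-(3) and the window-by-window aggregation can be sketched as follows. The actual MFCC and LPC features were extracted with jAudio; here a simple per-window statistic stands in for them, and the constants (pre-emphasis factor, window size, overlap) are illustrative rather than the values used in the experiments.

```python
# Sketch of the preprocessing and windowing steps described above. The mean
# of the raw samples in each window stands in for the per-window aggregation
# of MFCC/LPC values; all constants are illustrative.
import numpy as np

def preprocess(x, alpha=0.97):
    x = np.asarray(x, dtype=float)
    x_norm = x / np.max(np.abs(x))                  # normalization, cf. Eq. (1)
    # pre-emphasis filter H(z) = 1 - alpha*z^-1, i.e. y[i] = x[i] - alpha*x[i-1]
    return np.append(x_norm[0], x_norm[1:] - alpha * x_norm[:-1])

def windowed_features(signal, window=1024, overlap=0.3):
    step = int(window * (1 - overlap))              # overlapping windows
    feats = []
    for start in range(0, len(signal) - window + 1, step):
        frame = signal[start:start + window]
        feats.append(frame.mean())                  # one aggregated value per window
    return np.array(feats)

rng = np.random.default_rng(0)
buzz = np.sin(np.linspace(0, 200 * np.pi, 8000)) + 0.1 * rng.standard_normal(8000)
print(windowed_features(preprocess(buzz)).shape)
```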
In this work we considered three classification algorithms for building the model: the J48 decision tree, the MLP neural network and the SVM algorithm. Models were built by these algorithms to classify among the 15 different bumblebee cases most common in Slovenia, i.e. central Europe. Classifying between a worker and a queen is not difficult and on average 90% of all species are easily identified, but we are more interested in knowing the exact type of species, like hortorum, hypnorum and pratorum, in addition to the status of a bumblebee in the colony. So for each species we have two cases, either queen or worker, altogether 15 classes. The fact that the number of records for each class of species in our testing and training dataset is not evenly distributed caused slight inconvenience in building a good model. Also, this is one of the reasons why we have different rates of accuracy across species. Five of the bumblebee species we recognized with above 80% accuracy, and two of them had a rate of 95%. In Table 1 we provide the rates of recognition for each model, built separately on MFCC and LPC feature values with the three algorithms.

Table 1: Evaluation of the rates of recognition accuracy for each built model

        LPC    MFCC
J48     56%    56%
MLP     56%    60%
SVM     57%    64%

In practical terms, when the system proposed the three most probable classes, the accuracy rose to over 90% overall, enabling users to distinguish between the three proposed potential solutions visually. This is the way the system works at the moment.

4 CONCLUSION

In the future we want to make the segmentation step separate the records into samples automatically, by incorporating all we learned from the recordings and patterns of the 11 bumblebee species of both types. Also, we are going to build models using HMM and deep learning, because in many works related to audio classification HMM and deep learning produce the best results. Then we intend to compare their results with the ones we obtained from SVM.

References:
[1] Seppo Fagerlund. Bird Species Recognition Using Support Vector Machines. EURASIP Journal on Advances in Signal Processing, Volume 2007, Article ID 38637. doi:10.1155/2007/38637
[2] Zhu Leqing, Zhang Zhen. Insect sound recognition based on SBC and HMM. 2010 International Conference on Intelligent Computation Technology and Automation.
[3] Forrest Briggs, Raviv Raich, and Xiaoli Z. Fern. Audio Classification of Bird Species: a Statistical Manifold Approach.
[4] Diego F. Silva, Vinícius M. A. de Souza, Eamonn Keogh, Daniel P. W. Ellis. Applying Machine Learning and Audio Analysis Techniques to Insect Recognition in Intelligent Traps.
[5] Ruben Gonzalez. Better than MFCC Audio Classification Features.
[6] J. F. Bueno, T. M. Francoy, V. L. Imperatriz-Fonseca, A. M. Saraiva. Modeling an automated system to identify and classify stingless bees using the wing morphometry - A pattern recognition approach.
[7] H. Fujisaki, S. Ohno. Analysis and Modeling of Fundamental Frequency Contour of English Utterances. Proc. EUROSPEECH'95, Vol. 2, pp. 634-637, Philadelphia, 1996.
[8] Erling Wold, Thom Blum, Douglas Keislar and James Wheaton. Content Based Classification, Search, and Retrieval of Audio.
[9] Daniel McEnnis, Cory McKay, Ichiro Fujinaga, Philippe Depalle. jAudio: A feature extraction library.
104 THE LAS VEGAS METHOD OF PARALLELIZATION Bogdan Zavalnij Institute of Mathematics and Informatics University of Pecs Ifjusag utja 6, H-7634 Pecs, Hungary e-mail: bogdan@ttk.pte.hu ABSTRACT 2 THE FAMILY OF MONTE CARLO METHODS The family of the Monte Carlo methods can be While the methods of parallelizing Monte Carlo algorithms in engineering modeling very popular, these distinguished by the nature of their error. We can speak methods are of little use in discrete optimization about two sided, one sided and zero sided error methods. In problems. We propose that the variance of the Monte the case of two sided errors we approximate the solution Carlo method, the Las Vegas method can be used for step by step. In the analysis of the method we can measure these problems. We would like to outline the basic the distance of the approximation from the real solution, concept and present the algorithm working on a specific and find that we can get closer and closer in each step. This problem of finding the maximum clique. method is mostly useful in engineering modeling of problems with real number solutions. This two sided 1 INTRODUCTION method can be programmed in parallel environment with The Monte Carlo methods have been powerful tools in ease, as the steps usually independent. scientific and engineering modeling for the last half century In the case of one sided error, which takes place mostly in [1]. Their usage become even more intense in the era of decision problems, in each step we either get a final computers. The easiness of parallelization made these solution, or get no answer. This is the idea behind many methods useful in supercomputer environments as well. But primality tests, where we can find the composite numbers, apart from the original idea the more recent versions of but get uncertain answer for primes. The algorithms make a these methods, in which category the Las Vegas method is few dozen steps, and the uncertainty of being wrong falling [2], found little usage in algorithms, and even less in decreases to minimum. These algorithms usually are very parallel programs. The few exceptions are the primality fast and need no parallelization. tests and the quicksort algorithm, although some research The last method, which is called the Las Vegas method, is was made earlier in this field [7][8]. the case of zero sided error. The famous quicksort algorithm The problem we concentrate on is the maximum clique falls into this category. With these algorithms we always get problem [4], although the concept described in this paper the right answer – as the quicksort sorts the sequence in the applies to other problems in the field of discrete end –, but the running time of the algorithm can be optimization as well. The maximum clique problem can be described by a probability variable. In other words formulated in the following way. Given a finite simple sometimes the algorithm is very fast, and sometimes it can graph G=(V,E), where V represents the nodes and E be very slow. (Luckily the later case is very-very rare in the represents the edges. We call Δ a clique of G if the set of case of the quicksort.) Formally, we call an algorithm a Las Vegas algorithm if for vertices of Δ is subset of V; Δ is an induced subgraph of G a given problem instance the algorithm terminates returning ; and Δ is an all connected graph, thus all its nodes a solution, and this solution is guaranteed to be a correct connected to all the other nodes. 
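As a small illustration of the definitions above, the following brute-force sketch checks whether a vertex set forms a clique and searches for a k-clique by enumeration; it is exponential and only meant to make the k-clique question concrete, not to stand in for the algorithms discussed later in the paper.

```python
# Brute-force illustration of the clique and k-clique notions defined above.
# Practical only for tiny graphs: the k-clique problem is NP-complete.
from itertools import combinations

def is_clique(vertices, edge_set):
    return all((u, v) in edge_set or (v, u) in edge_set
               for u, v in combinations(vertices, 2))

def find_k_clique(n, edges, k):
    """Return a k-clique of the graph with vertices 0..n-1, or None."""
    edge_set = set(edges)
    for candidate in combinations(range(n), k):
        if is_clique(candidate, edge_set):
            return candidate
    return None

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
print(find_k_clique(4, edges, 3))   # (0, 1, 2)
print(find_k_clique(4, edges, 4))   # None
```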
We call Δ a maximum solution; and for any given problem instance, the run-time clique if no other clique of G is bigger than it. The of the algorithm applied to this problem is a random maximum clique problem is to find the size of a maximum variable. [13] clique, and it is a well known NP-hard problem. A From this description it is clear, that the Monte Carlo simplified variation of this problem is the k- clique problem, method, on one hand, can be easily used for engineering which is a problem of the NP-complete class. The question problems as we are looking for real number answers with is, that if given a graph G, and a positive integer k, is there a certain correctness. On the other hand, in the case of clique of size k in the graph. To answer the question we discrete and combinatorial optimization problems we either must present a k-clique of the graph, or either prove usually need exact answers, so the Las Vegas method can that there is none in the graph. prove itself of more use. 105 3 PARALLELIZATION WITH THE AIM OF THE can help other instances to solve their subproblems faster. LAS VEGAS METHOD This method resembles the BlackBoard technique known The variance in the running time of a Las Vegas algorithm well in the field of Artificial Intelligence. led Truchet, Richoux and Codognet to implement an This approach can be used to parallelize several different interesting way of parallelization the algorithms for some discrete optimization algorithms. Namely, we can use it in NP-complete discrete optimization problems [13]. The any Branch-and-Bound technique instead of the branching authors note that the algorithm implementation for those rule. As it happens at a branching we have the problem of problems heavily depends on the "starting point" of the choosing the sequence of the branches. The speed of the algorithm, as it starts from a random incorrect solution and algorithm heavily depends on this sequence, as the result in constantly changes it to find a real solution. Depending on one branch may help us in an other branch – as a new, the starting incorrect solution the convergence of the better bound for example. algorithm may be very fast or slow as well. The idea behind the Las Vegas parallel algorithm was to start several 4 AN APPLICATION instances of the sequential algorithm from different starting In order to demonstrate the described method we choose a points and let them run independently. The first instance more simple algorithm than a general Branch-and-Bound. which finds the solution shuts down all the other instances Instead we used an algorithm from Sandor Szabo [10], and the parallel algorithm terminates. As the running time which answers the k-clique problem by dividing the original of the different instances vary, some will terminate faster, problem into thousands of subproblems. These subproblems thus ending the procedure in shorter time. The article then can be processed parallely with a sequential program. describes the connection of the variance of the running Obviously this algorithm needs proper number of times and the possible speed-up when using k instances and subproblems in order to achieve proper speed-ups, which found that for some problems a linear speedup could be this algorithms achieves well. The proposal starts with a achieved. quasi coloring with k-1 colors, and then examines each This approach can be useful in several ways. 
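The basic Las Vegas parallelization discussed in the next section can be sketched as a race between independent instances started from different random seeds, where the first instance to finish terminates the others. The solver below is a stand-in with a random running time, not an actual clique search.

```python
# Sketch of the "race" form of Las Vegas parallelization: several instances of
# a randomized solver run independently and the first finisher wins.
import multiprocessing as mp
import random
import time

def solver(seed, result_queue):
    rng = random.Random(seed)
    time.sleep(rng.uniform(0.1, 2.0))        # running time is a random variable
    result_queue.put((seed, "solution"))     # the answer is always correct

def las_vegas_race(n_instances=8):
    queue = mp.Queue()
    workers = [mp.Process(target=solver, args=(seed, queue))
               for seed in range(n_instances)]
    for w in workers:
        w.start()
    seed, answer = queue.get()                # first instance to finish wins
    for w in workers:                         # shut down the remaining instances
        w.terminate()
        w.join()
    return seed, answer

if __name__ == "__main__":
    print(las_vegas_race())
```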
While the previous approach is simple and elegant, it lacks something. First, the different instances cannot help each other in finding the solution. Second, each instance tries to solve the whole problem, and no division into subproblems appears in this proposal.
I propose a different approach which includes these notions. If we divide an NP-hard problem into parts, the arising subproblems fall into the same category as described above: they are also NP-hard problems and have great variance in solution time. But we then face the problem of constructing the sequence of the divided subproblems. As the solution of one subproblem can be helpful in the solution of another, the sequence of these subproblems is of great importance: we would like to solve the easier ones first to help the more complex ones later. Here we can use some heuristics, but more often we proceed in the order in which the subproblems are already given, which leads to an inefficient algorithm.
Instead we can use the proposed Las Vegas technique, starting the instances of the solver for the arising subproblems in parallel. Thus we may overcome the question of sequence construction. As we have seen, some problems will run much faster and terminate with the desired answer. These answers can then be fed to the other instances and help them solve their subproblems faster. This way each instance, solving a partial problem instead of the whole, can help other instances to solve their subproblems faster. This method resembles the BlackBoard technique well known in the field of Artificial Intelligence.
This approach can be used to parallelize several different discrete optimization algorithms. Namely, we can use it in any Branch-and-Bound technique instead of the branching rule. At a branching we have the problem of choosing the sequence of the branches. The speed of the algorithm depends heavily on this sequence, as the result in one branch may help us in another branch – as a new, better bound, for example.

4 AN APPLICATION
In order to demonstrate the described method we chose a simpler algorithm than a general Branch-and-Bound. We used an algorithm from Sandor Szabo [10], which answers the k-clique problem by dividing the original problem into thousands of subproblems. These subproblems can then be processed in parallel with a sequential program. Obviously this approach needs a proper number of subproblems in order to achieve proper speed-ups, which this algorithm achieves well. The proposal starts with a quasi coloring with k-1 colors, and then examines each disturbing edge, asking whether that edge can be an edge of a k-clique. If yes, then we have found a positive solution; if no, then the edge can be deleted from the graph. After all the disturbing edges are deleted, we get a proper k-1 coloring, which forbids a k-clique, and thus we have solved the problem. I have implemented this proposal and measured the running times for several different problems [14].
The measurements compare three versions of the algorithm. In the first, no information is passed from one subproblem to another to help in the solution; the program instances run totally independently. In the second, I constructed a sequence where the helping information is the consequence of this sequential ordering, so the help is given in advance. This means that we can delete the edges in the sequence of the subproblems in advance, asserting that no k-clique can contain them. There is no actual communication between the program instances and they also run independently. These two versions are detailed in the paper of Sandor Szabo [10]. The third version is the Las Vegas method, where the program instances start in parallel, and when one is finished, this information is given to the others, thus speeding up their solution time, as the subproblems can be reduced with the aid of this information. In our case, if the algorithm for a given subproblem reports that there is no k-clique that contains that edge, then we delete this particular edge from all the subproblems, including those that are already running. For this purpose we obviously need a sequential clique search program where an edge can be deleted during the runtime.
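The essential point of the third version is that a finished subproblem shrinks the ones still running. The sketch below illustrates that information flow in a shared-memory setting (the paper uses separate processes on a supercomputer); the toy graph, the brute-force k-clique test and all the names are our own and only stand in for the real clique search.

```python
import itertools
import random
import threading

# Tiny random graph; one subproblem per edge asks: can a k-clique contain this edge?
random.seed(1)
n, k = 12, 5
edges = {frozenset(p) for p in itertools.combinations(range(n), 2) if random.random() < 0.5}

deleted = set()                 # edges already proved to lie in no k-clique (shared state)
lock = threading.Lock()

def has_kclique_through(edge, alive):
    """Brute-force check whether some k-clique of the graph `alive` uses `edge`."""
    u, v = tuple(edge)
    for rest in itertools.combinations(set(range(n)) - edge, k - 2):
        nodes = (u, v) + rest
        if all(frozenset(p) in alive for p in itertools.combinations(nodes, 2)):
            return True
    return False

def worker(edge):
    with lock:
        alive = edges - deleted          # snapshot: subproblems started later see a smaller graph
    if edge not in alive:
        return                           # another subproblem already removed this edge
    if not has_kclique_through(edge, alive):
        with lock:
            deleted.add(edge)            # publish the deletion to all other subproblems

threads = [threading.Thread(target=worker, args=(e,)) for e in edges]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"{len(deleted)} of {len(edges)} edges can be deleted: no {k}-clique contains them")
```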
5 RESULTS
I used three sets of graph problems. The first set consists of random graphs with a given edge probability. The second set is taken from the DIMACS challenge website [5][6]. In the third set two extremely hard problems are represented: one from coding theory [3][12], the other from combinatorial optimization [11]. For all problems we know the clique size, so I ran the algorithm to prove that there is no clique bigger by one than the known clique number. This step is important because finding the maximum clique itself depends largely on luck and thus says little about the quality of an algorithm, while proving that there is no clique bigger by one than the known one requires an extended search through the whole search tree, and thus provides a good comparison for different algorithms and implementations.
The tables show the name of a problem instance, the size (N), the density (%), the maximum clique size (clique), the running time of the sequential algorithm on the same computer (seq), and the number of subproblems that arise in the algorithm (parts). The tests were run on 4+1, 16, 64 and up to 512 processor cores (one core doing the distribution and not taking part in the calculation itself), and the running times are given in seconds for those core numbers. The "nopt" rows are from the first version with no help between the subproblems, "opt" stands for the more optimal version with help from one instance to another by the original sequence, and "lv" represents the Las Vegas method of parallelization. If the running time exceeded the time limit, the table contains no data (*).
The produced results seem to support the idea. The running times of the third algorithm were in most cases close to the running times of the second algorithm, which gives help to the other subproblems; this is an interesting fact by itself. Even more interesting, in some cases it surpassed the second algorithm. These were the most difficult cases, so this method could perhaps be useful for the solution of the most difficult problems.

Table 2: Problems from the DIMACS challenge

           brock800_3  latin_square_10  keller5  MANN_a45  p_hat1500-1  p_hat500-3
N                 800              900      776      1035         1500         500
%                  65               76       75        99           25          75
clique             25               90       27       345           12          50
parts            4888              380      420        45        14918         657
seq              7302             4902     4531      3666          278           *
5-nopt              *                *        *      1340          894           *
5-opt               *             1423        *       719          814           *
5-lv                *             1504        *      1051          824           *
16-nopt             *              531      986       402          247           *
16-opt              *              403      672       205          225           *
16-lv               *              430      686       388          232           *
64-nopt           472              150      318       183           60           *
64-opt            413              105      173       140           54         11k
64-lv             425              128      228       174           56        7165
512-nopt           64               82      138       183            9         41k
512-opt            55               62      137       140            8         11k
512-lv             59               83      138       183            8        5300

Table 3: Problems of monotonic matrices and deletion codes

           monoton-7  monoton-8  monoton-9  deletion-9
N                343        512        729         512
%                 79         82         84          93
clique            19         23         28          52
parts            313        590        932         375
seq                7       2347          *           *
5-nopt            76          *          -           -
5-opt             74       1282          -           -
5-lv              74       1292          -           -
16-nopt           23        959          -           -
16-opt            21        408          -           -
16-lv             22        385          -           -
64-nopt            8        475       150k           -
64-opt             6        409       150k           -
64-lv              6        195        44k           -
512-nopt           4        405       150k           *
512-opt            2        409       150k           *
512-lv             4        243        31k        255k
Table 1: Random graph problems

N          200   300   500   500   500   1000   1000   1000
%           90    80    60    70    80     40     50     60
clique      40    29    17    22    32     12     15     20
parts      152   540  2478  2231  1664  10918  10955   9823
seq        623   898    67  3453     *    136    447    15k
5-nopt     376   466   431     *     *   1268      *      *
5-opt      109   231   420  1401     *   1156      *      *
5-lv       126   242   424  1444     *   1168      *      *
16-nopt    123   135   119   584     *    350      *      *
16-opt      33    64   116   387     *    319      *      *
16-lv       66    71   118   407     *    329      *      *
64-nopt     49    45    29   142     *     84    368   1236
64-opt      27    16    28    93   18k     76    345   1064
64-lv       39    23    29    99   21k     79    355   1100
512-nopt    38    23     4    25   14k     11     48    158
512-opt     27    15     4    14  6595     10     45    135
512-lv      39    23     4    19  4189     10     46    142

ACKNOWLEDGEMENTS
The author would like to thank the HPC Europe grant for the fruitful visit to Helsinki, and the Finnish Computer Science Center, which hosts the supercomputer Sisu on which the computations were performed.

References
[1] Hammersley, J.M. and Handscomb, D.C. Monte Carlo Methods. London, 1975 (1964).
[2] Babai, L. Monte-Carlo algorithms in graph isomorphism testing. Université de Montréal, D.M.S. No. 79-10. http://people.cs.uchicago.edu/~laci/lasvegas79.pdf
[3] Bogdanova, G.T., Brouwer, A.E., Kapralov, S.N. and Ostergaard, P.R.J. Error-Correcting Codes over an Alphabet of Four Elements. Designs, Codes and Cryptography, August 2001, Volume 23, Issue 3, pp. 333–342.
[4] Bomze, I.M., Budinich, M., Pardalos, P.M. and Pelillo, M. (1999). The Maximum Clique Problem. In D.-Z. Du and P.M. Pardalos (Eds.), Handbook of Combinatorial Optimization (pp. 1–74). Kluwer Academic Publishers.
[5] DIMACS. ftp://dimacs.rutgers.edu/pub/challenge/graph/ (May 30, 2014)
[6] Hasselberg, J., Pardalos, P.M. and Vairaktarakis, G. Test case generators and computational results for the maximum clique problem. Journal of Global Optimization (1993), 463–482.
[7] Luby, M. and Ertel, W. Optimal Parallelization of Las Vegas Algorithms. In Enjalbert, P. et al. (Eds.), Lecture Notes in Computer Science (1994), pp. 461–474. Springer Berlin Heidelberg.
[8] Luby, M., Sinclair, A. and Zuckerman, D. Optimal Speedup of Las Vegas Algorithms. In: Proceedings of the 2nd Israel Symposium on Theory of Computing and Systems, Jerusalem, Israel, June 1993.
[9] Ostergaard, P.R.J. A fast algorithm for the maximum clique problem. Discrete Applied Mathematics (2002), 197–207.
[10] Szabo, S. Parallel algorithms for finding cliques in a graph. Journal of Physics: Conference Series, Volume 268, Number 1, 2011. J. Phys.: Conf. Ser. 268 012030. doi:10.1088/1742-6596/268/1/012030
[11] Szabo, S. Monotonic matrices and clique search in graphs. Annales Univ. Sci. Budapest., Sect. Comp. (2013), 307–322.
[12] Sloane, N. http://neilsloane.com/doc/graphs.html (May 30, 2014)
[13] Truchet, C., et al. Prediction of Parallel Speed-ups for Las Vegas Algorithms. http://arxiv.org/abs/1212.4287, 2012.
[14] Zavalnij, B. Three Versions of Clique Search Parallelization. Journal of Computer Science and Information Technology, June 2014, Vol. 2, No. 2, pp. 09–20.

RESOURCE-DEMAND MANAGEMENT IN A SMART CITY

Jernej Zupančič1, Damjan Kužnar1, Boštjan Kaluža1, Matjaž Gams1
1Department of Intelligent Systems, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana
e-mail: jernej.zupancic@ijs.si
ABSTRACT
Due to the increasing demand and the limited amount of natural resources, the costs of resource supply are increasing. Reasons for the increasing demand are also the growing needs of urban life in the city and the lack of mechanisms that would encourage proper resource consumption. In the paper we present a hierarchical architecture for resource-demand management in a Smart city. The proposed architecture enables distributed computing and robust resource-demand management. Further, we present a two-stage mechanism that encourages the reduction of resource consumption: in the first stage it ensures that all the consumers are satisfied with the resource per-unit price, and in the second stage rewards are offered to consumers who are prepared to further lower their consumption.

1 INTRODUCTION
The demand for resources such as electricity, water, natural gas and oil is on the rise. Together with the limited amount of natural resources, this contributes to the increasing costs of resource supply [1]. Consumers usually pay the same per-unit price for a resource, although large consumers are responsible for the rising cost. With the prevalence of metering technology and the increasing sustainability awareness of consumers, different levels of consumption are urged to be priced differently.
Using a proper mechanism, one could lower resource demand peaks, which are usually expensive (additional, less efficient means of resource production have to be enabled), by shifting the consumption of the resource to a part of the day when the resource is in low demand. This way the resource can be used much more efficiently, and investment into resource production units that are required to meet peak demands can be postponed.
A resource-demand management mechanism that sets the same price for every consumer in advance does not fully exploit demand-management capabilities. Dynamic-pricing mechanisms (the prices change every so often) were already proposed (for instance in [6]). Most of such mechanisms require the consumers either to report their utility function (which raises privacy concerns) or to leave the decision about consumption to the consumers themselves (which is not as efficient as it could be). A dynamic-pricing mechanism that negotiates prices with every individual was proposed in [3]. However, their approach has to assume that the consumers will report their consumption truthfully, which is not always the case. Truth-incentive demand management was proposed in [4].
The mechanism described in [4] has several desirable properties besides being truth-incentive: it is proved to converge to a Nash equilibrium, it is budget balanced (the total cost of resource provision equals the total cost the consumers pay), and it reaches the global optimum (minimal peak-to-average ratio). However, it still raises some issues: the resource consumption of every consumer is known to everyone, consumers are only encouraged to shift their load to different parts of the day and not to reduce consumption, smaller consumption does not result in a smaller per-unit price, and real-time pricing requires price prediction capabilities.
In this paper we give a short presentation of a mechanism [7] that addresses some of those shortcomings. It changes the prices dynamically and adapts them to each consumer individually, it is truth-incentive, it encourages lower resource consumption and it preserves the privacy of the consumer. Further, we present the architecture that enables the application of the proposed mechanism.
The rest of the paper is structured as follows. In section 2 we present the envisaged architecture for resource-demand management in a Smart city. In section 3 we give a general description of demand-management mechanisms, and in section 4 the negotiation protocol that encourages consumption reduction of convexly priced resources is presented. Section 5 summarizes and concludes the paper.
2 ARCHITECTURE FOR RESOURCE-DEMAND MANAGEMENT IN SMART CITY
Every city is hierarchically structured into districts, then further into streets, and then even further into individual buildings with devices and appliances. Therefore, it is only natural that a resource-demand management architecture acknowledges this hierarchical structure. Although a hierarchical structure has some disadvantages compared to the star formation (Figure 1) that is typically used for resource-demand management (a hierarchical structure requires more communication nodes and more messages to be transmitted, which could result in slower response), it also possesses some desirable properties: it is distributed, it is more resistant against failures, and it can better represent the reality and real-world decision making.
Decision nodes in the architecture have to communicate among themselves, they need some computational ability, and they have to take into account real-world decision-maker preferences when taking decisions by themselves. They are agents forming a multi-agent system.

[Figure 1: Hierarchical structure on the right and star formation on the left]

The hierarchical organizational structure of the multi-agent system can be described as follows. At the root of the structure is the top decision entity in the city, responsible for setting the cost of resource provision and the global price computation of the resource. We call the root node a Resource negotiator agent. At lower levels there are Aggregation agents that propagate the price information from higher nodes to lower nodes and resource consumption information from lower to higher nodes. They could be independent, since every district could have its own policies on resource consumption. At the lowest level there are House coordinator agents that can operate a group of appliances and devices. Every parent node can either negotiate with its child node/agent (when the child agent does not reveal its information) or optimize it (when the child agent reveals its information and allows the parent agent to control it).
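The hierarchy can be pictured as a simple agent tree in which consumption information flows upward and price information flows downward. The sketch below is our illustration of that structure (the class and method names are not from the authors' implementation), with the negotiate-or-optimize choice reduced to a flag on each child.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Agent:
    name: str
    children: List["Agent"] = field(default_factory=list)
    reveals_info: bool = False   # a revealing child may be optimized directly by its parent
    demand: float = 0.0          # leaf (house coordinator) level consumption

    def collect_demand(self) -> float:
        """Consumption information flows from the houses up to the root."""
        if not self.children:
            return self.demand
        return sum(child.collect_demand() for child in self.children)

    def push_price(self, per_unit_price: float) -> None:
        """Price information flows from the root (Resource negotiator) downward."""
        for child in self.children:
            if child.reveals_info:
                pass                              # placeholder: parent optimizes the child directly
            else:
                child.push_price(per_unit_price)  # otherwise the price is forwarded / negotiated

# City -> district -> street -> houses, mirroring the hierarchical side of Figure 1
houses = [Agent(f"House{i}", demand=d) for i, d in enumerate([1.2, 0.8, 2.5], start=1)]
street = Agent("Street1", children=houses)
district = Agent("District1", children=[street])
city = Agent("City", children=[district])        # root: Resource negotiator agent
print("total city demand:", city.collect_demand())
city.push_price(0.15)
```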
3 RESOURCE PRICING MECHANISM
In the previous section we presented an underlying architecture that defines the network of nodes and the connections between them. In this section we give some general information about mechanisms or protocols. Mechanisms define how the communication is carried out over the proposed architecture, what the content of the communication is, and what the rules for the interaction between the nodes are. We can classify mechanisms according to their properties.
A mechanism is strategy proof or truthful when the best the agents can do while the mechanism is in action is to tell the truth. There is no incentive for any of the agents to lie about its information.
A mechanism converges when it reaches some final distribution of prices and consumptions in a finite number of steps. The convergence property guarantees that the mechanism ends; however, the speed of convergence is important as well and should be addressed.
A mechanism is budget-balanced when the cost of resource provision to the consumers is the same as the total cost the consumers pay for the consumed resource. No agent can benefit from the mechanism, be it a resource producer or a resource consumer.
All the mentioned properties are preferred in a mechanism that is to be implemented in reality.
Mechanisms can be further divided according to the length of the time period for which they determine consumption. Resource consumption over the day can be divided into finitely many time intervals. A dynamic mechanism determines the resource consumption for the next short time interval (on the scale of an hour); in that case the price and consumption are set dynamically every time a new time interval approaches. This type of mechanism is used when it is difficult to determine the cost of resource production and distribution far in advance. With a dynamic mechanism the agents cannot schedule the operation of the devices or appliances for the whole day; the resource demand dynamically matches the price of the resource.
A scheduling mechanism determines the price and the resource consumption for every time interval of the whole day. Appliance operation is scheduled in a way that optimizes the cost or energy efficiency and meets the requirements of the user. When using a scheduling mechanism, the resource negotiator needs good knowledge of the resource provision cost for every time interval of the whole day. Unexpected events can greatly disturb the schedule set by the agents.

4 A DYNAMIC NEGOTIATION MECHANISM FOR CONVEXLY PRICED RESOURCES
In this section we give a quick overview of a negotiation mechanism that encourages the reduction of consumption of a convexly priced resource. A more detailed description can be found in [7]. The mechanism consists of two stages: a negotiation stage and a renunciation stage. In the first stage the goal is to reduce resource consumption to a level where every consumer is satisfied with the price it has to pay. In the second stage rewards are offered to consumers that are prepared to further lower their consumption.
In the negotiation stage many rounds take place, with consumers reporting their desired consumption to the negotiator and the negotiator computing prices for every consumer. The prices are computed individually for every consumer using the serial cost sharing mechanism [5]. The serial cost sharing mechanism determines a fair price for every consumer – lower consumption results in a lower price. Further, since the resource is convexly priced, it also results in lower per-unit pricing.
Individual prices are computed by a function price(i, f, c) that takes as inputs the following parameters: consumer i, a sorted consumption vector c and the resource cost function f. The function price is defined recursively in Eq. (1).
\[
\mathrm{price}(0, f, c) = 0, \qquad
\mathrm{price}(i, f, c) = \mathrm{price}(i-1, f, c) +
\frac{f\!\left(\sum_{j=1}^{i-1} c[j] + (N+1-i)\,c[i]\right) - f\!\left(\sum_{j=1}^{i-1} c[j] + (N+1-i)\,c[i-1]\right)}{N+1-i}
\tag{1}
\]

where N is the number of consumers participating in the negotiation and c[0] is taken to be 0.
If any of the consumers does not agree with the price it receives and wants to further reduce its consumption, anticipating a lower per-unit price, another round of negotiation takes place. In the following round the desired consumption of an individual consumer can be the same or lower. The negotiation stage ends when the demand is the same for every consumer in two consecutive rounds.
In the renunciation stage the consumers that further reduce their consumption are rewarded, under the condition that the reward they are offered is sufficient to compensate their consumption reduction. There is only one round in the renunciation stage. Consumers are addressed one by one, starting with the consumer with the lowest demand. The algorithm for the renunciation stage outputs new prices while taking into account further reductions offered by consumers and the rewards demanded. The negotiator computes the reduction in the total resource supply cost (the cost is lowered due to lower demand), which can be offered as a reward to a consumer. To ensure that the consumers who had lower resource demand after the negotiation stage receive a lower final per-unit price, we may have to adjust the reward. If the consumer agrees with the reward, it lowers its consumption and receives a discount.
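A direct transcription of Eq. (1) is given below as a sketch (our code, not the authors'); it takes c[0] = 0 for the base case and uses a quadratic function as a stand-in for the convex resource cost. The printed totals illustrate the budget-balance property: the sum of the individual prices equals the cost of supplying the total consumption.

```python
def price(i, f, c):
    """Serial cost sharing price of consumer i (1-based), following Eq. (1).
    c is the consumption vector sorted in ascending order, f the convex cost."""
    N = len(c)
    if i == 0:
        return 0.0
    prefix = sum(c[:i - 1])                 # sum_{j=1}^{i-1} c[j]
    prev = c[i - 2] if i >= 2 else 0.0      # c[i-1] in Eq. (1), with c[0] taken as 0
    step = (f(prefix + (N + 1 - i) * c[i - 1]) -
            f(prefix + (N + 1 - i) * prev)) / (N + 1 - i)
    return price(i - 1, f, c) + step

f = lambda x: x ** 2                        # convex stand-in cost function
c = sorted([1.0, 2.0, 3.0])                 # reported consumptions, ascending
prices = [price(i, f, c) for i in range(1, len(c) + 1)]
print(prices)                               # individual prices, non-decreasing
print(sum(prices), f(sum(c)))               # both 36.0: budget balanced
```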
4.1 Mechanism properties
The presented mechanism has several desirable properties. In this section we list them and present sketches of the proofs.
Negotiations converge in a finite number of steps. The renunciation stage ends in one round; therefore, we only have to show that the negotiation stage of the protocol ends in a finite number of rounds. Since there is a finite number of consumers, every consumer has a finite number of appliances and devices, and every appliance or device has a finite number of operating states, there is a finite number of consumption level combinations. Since the consumers cannot increase their desired consumption in two consecutive rounds, a repetition of demand will surely occur. Therefore, the negotiation will end.
The protocol is strategy-proof when a consumer is risk-averse. A consumer could insist on a large desired consumption in the negotiation stage and then try to obtain a large reward when reducing the resource consumption in the second stage. However, rewards are first given to the small consumers and those rewards are also the largest, since the pricing of the resource is convex. Further, the rewards are adjusted so that a consumer with a higher demand after the negotiation stage cannot obtain a lower per-unit price in the renunciation stage. Therefore, by lying an agent risks getting a price it is not prepared to pay. Truth-telling is therefore the best strategy for a risk-averse agent.
Consumers are motivated to reduce resource consumption by the protocol itself. Due to the serial cost sharing mechanism, which is used for pricing in the negotiation stage, lower demand results in a lower resource price. Further, since the resource is convexly priced, the per-unit price is reduced as well.
The protocol is budget-balanced. The serial cost sharing mechanism defines prices in a way that the cost of resource supply equals the total cost the consumers have to pay. In the renunciation stage, when a consumer reduces its resource consumption, a budget imbalance (a surplus) occurs in favor of the negotiator (the seller of the resource), since larger consumers pay a higher price than needed to obtain the desired amount of the resource. The surplus is then offered to the consumer who reduced its consumption, and sometimes even to the consumers with lower consumption, due to the per-unit price equalization. Therefore, no surplus is generated at the end of the renunciation stage.

4.2 Experiments
We tested the negotiation mechanism on a multi-agent system implemented in the JADE (Java Agent DEvelopment) framework [2]. We implemented a resource negotiator agent and house representative agents in star formation. Each house representative agent possesses the information about the electrical energy consumption of the devices and the information about consumer preferences (the maximal per-unit price for operating each device and the required reward for not operating the device). This information is private to the house representative agent and is sent neither to the resource negotiator nor to other house representative agents. A linear piecewise function was chosen for the convex resource cost function.
In a typical simulation run a reduction of the energy consumption level can be observed (Figure 2). In the first round of the negotiation (up to the dashed line), every house representative achieves the price it is willing to pay. A further reduction in resource consumption occurs in the renunciation stage (between the dashed and the solid line), where the per-unit price is reduced as well for the low-consuming houses H1 and H2. The large-consuming house H3 does not receive a reward large enough to further reduce its consumption.
In a scalability test we gradually increased the number of agents involved in the experiment up to 100 000. Linear scalability of the mechanism is observed (Figure 3).

[Figure 2: Cumulative consumption observed during negotiation, with final prices added]
[Figure 3: Scatter plot for the scalability experiment (protocol execution time [s] vs. number of agents)]
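To tie the two stages together, the following toy run (our simplification, not the implementation used in the experiments) iterates negotiation rounds until no consumer changes its demand and then performs a single renunciation pass. Consumer preferences are reduced to a maximal per-unit price and a reward threshold, demands change in whole units, and the cost function is again a quadratic stand-in.

```python
def serial_prices(c, f):
    """Iterative serial cost sharing prices (Eq. (1)) for ascending consumption vector c."""
    N, out, total = len(c), [], 0.0
    for i in range(1, N + 1):
        prefix = sum(c[:i - 1])
        prev = c[i - 2] if i >= 2 else 0.0
        total += (f(prefix + (N + 1 - i) * c[i - 1]) -
                  f(prefix + (N + 1 - i) * prev)) / (N + 1 - i)
        out.append(total)
    return out

f = lambda x: x ** 2                         # convex stand-in resource cost

# Hypothetical consumer preferences: desired demand, maximum acceptable
# per-unit price, and the reward needed to give up one more unit.
consumers = [
    {"demand": 3.0, "max_unit_price": 7.0, "reward_needed": 4.0},
    {"demand": 2.0, "max_unit_price": 7.0, "reward_needed": 3.0},
    {"demand": 1.0, "max_unit_price": 6.0, "reward_needed": 2.5},
]

# --- Negotiation stage: repeat until no consumer changes its demand --------
while True:
    consumers.sort(key=lambda a: a["demand"])
    prices = serial_prices([a["demand"] for a in consumers], f)
    changed = False
    for agent, p in zip(consumers, prices):
        if agent["demand"] > 0 and p / agent["demand"] > agent["max_unit_price"]:
            agent["demand"] -= 1.0           # price too high: lower the desired demand
            changed = True
    if not changed:
        break

# --- Renunciation stage: one pass, smallest consumer first -----------------
base_cost = f(sum(a["demand"] for a in consumers))
for agent in consumers:                      # already sorted ascending
    reduced = f(sum(a["demand"] for a in consumers) - 1.0)
    surplus = base_cost - reduced            # supply cost saved if this agent gives up a unit
    if agent["demand"] >= 1.0 and surplus >= agent["reward_needed"]:
        agent["demand"] -= 1.0               # reward is sufficient: accept the reduction
        base_cost = f(sum(a["demand"] for a in consumers))

print([a["demand"] for a in consumers])
print(serial_prices([a["demand"] for a in consumers], f))
```

In the real protocol the reward is additionally adjusted so that a consumer with a higher demand after the negotiation stage can never end up with a lower per-unit price than a smaller consumer; that correction is omitted here for brevity.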
5 CONCLUSION
In the paper we presented a hierarchical architecture for resource-demand management in smart cities, which is a natural representation of the city and enables distributed computation and robust control. We defined classes of mechanisms that can be applied to the proposed architecture. In the second part of the paper we presented a dynamic mechanism that encourages the reduction of resource consumption when the resource cost function is convex. It is a two-stage mechanism that ensures consumer satisfaction in the first stage when the consumer is truthful, and encourages further reduction of consumption in the second stage by offering rewards. The mechanism has several desirable properties: it is budget-balanced, it converges in a finite number of steps, it is strategy proof, and it scales linearly with the number of agents that participate in the mechanism.
Further work will include the analysis and modelling of consumer behaviour. The mechanism will then be applied to real-world models. The goal of this research is to provide a modular mechanism that will incorporate dynamic demand-response together with scheduling. Further, it will be able to deal with different types of architectures, agents that have hidden information, and agents that reveal all their information. That kind of mechanism will combine optimization and negotiation in an efficient and universal way.

Acknowledgements
We thank Gregor Grasselli, Matej Krebelj and Jure Šorn for the help with the implementation of the experiments in the JADE environment. The research was sponsored by the ARTEMIS Joint Undertaking, Grant agreement nr. 333020, and the Slovenian Ministry of Economic Development and Technology.

References
[1] E. B. Barbier. Economics, natural-resource scarcity and development: conventional and alternative views. Routledge, 2013.
[2] F. L. Bellifemine, G. Caire, D. Greenwood. Developing multi-agent systems with JADE. John Wiley & Sons, 2007.
[3] F. Brazier, F. Cornelissen, R. Gustavsson, C. M. Jonker, O. Lindeberg, B. Polak, J. Treur. Agents negotiating for load balancing of electricity use. Distributed Computing Systems, 1998. Proceedings. 18th International Conference on, pp. 622–629, 1998.
[4] A.-H. Mohsenian-Rad, V. W. S. Wong, J. Jatskevich, R. Schober, A. Leon-Garcia. Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid. Smart Grid, IEEE Transactions on 1(3), pp. 320–331, 2010.
[5] H. Moulin, S. Shenker. Serial cost sharing. Econometrica: Journal of the Econometric Society, pp. 1009–1037, 1992.
[6] P. Samadi, H. Mohsenian-Rad, R. Schober, V. W. S. Wong. Advanced demand side management for the future smart grid using mechanism design. Smart Grid, IEEE Transactions on 3(3), pp. 1170–1180, 2012.
[7] J. Zupančič, D. Kužnar, B. Kaluža, M. Gams. Two-stage negotiation protocol for lowering the consumption of convexly priced resources. IAT4SIS '14, Workshop proceedings, to be published, 2014.

Indeks avtorjev / Author index

Ambrožič Nejc ... 74
Aydın Cevat ... 78
Bohanec Marko ... 62
Bosnić Zoran ... 18
Brence Jure ... 5
Černe Matija ... 9
Cvetković Božidara ... 14
Demšar Jaka ... 18
Dovgan Erik ... 22
Džeroski Sašo ... 97
Filipič Bogdan ... 22, 66, 74
Frešer Martin ... 26
Gams Matjaž ... 5, 30, 34, 42, 46, 70, 74, 93, 102, 109
Gantar Klemen ... 22
Gjoreski Hristijan ... 34, 38
Gjoreski Martin ... 38
Gosar Žiga ... 5
Grad Janez ... 102
Gradišek Anton ... 42
Jovan Leon Noe ... 46
Kaluža Boštjan ... 9, 50, 85, 109
Kerkhoff Rutger ... 50
Koblar Valentin ... 22
Konecki Mario ... 54
Kononenko Igor ... 18
Košir Igor ... 26
Kulakov Andrea ... 38
Kužnar Damjan ... 46, 109
Luštrek Mitja ... 9, 14, 26, 34, 42, 58, 70, 102
Madevska Bogdanova Ana ... 89
Martinčič Ipšić Sandra ... 70
Mihelčić Matej ... 62
Mirchevska Violeta ... 26
Mlakar Miha ... 66
Nikič Svetlana ... 46
Piltaver Rok ... 70, 74
Sabancı Kadir ... 78
Šef Tomaž ... 74, 81, 93
Seražin Vid ... 5
Simidjievski Nikola ... 97
Slapničar Gašper ... 85
Somrak Maja ... 42, 58
Šorn Jure ... 93
Tashkoski Martin ... 89
Tavčar Aleš ... 50, 74, 93
Tušar Tea ... 66, 74, 93
Vidmar Nina ... 97
Yusupov Mukhiddin ... 102
Zavalnij Bogdan ... 105
Zupančič Jernej ... 5, 109

Konferenca / Conference: Inteligentni sistemi / Intelligent Systems
Uredili / Edited by: Rok Piltaver, Matjaž Gams