Zbornik 21. mednarodne multikonference INFORMACIJSKA DRUŽBA – IS 2018, Zvezek E
Proceedings of the 21st International Multiconference INFORMATION SOCIETY – IS 2018, Volume E

Delavnica AS-IT-IC / AS-IT-IC Workshop

Uredila / Edited by Matjaž Gams, Jernej Zupančič

http://is.ijs.si
8.–12. oktober 2018 / 8–12 October 2018
Ljubljana, Slovenia

Editors:
Matjaž Gams, Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana
Jernej Zupančič, Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana

Publisher: Jožef Stefan Institute, Ljubljana
Proceedings prepared by: Mitja Lasič, Vesna Lasič, Lana Zemljak
Cover design: Vesna Lasič
Access to the e-publication: http://library.ijs.si/Stacks/Proceedings/InformationSociety
Ljubljana, October 2018

Informacijska družba, ISSN 2630-371X

Cataloguing-in-publication (CIP) record prepared by the National and University Library in Ljubljana
COBISS.SI-ID=31893031
ISBN 978-961-264-139-9 (pdf)

FOREWORD TO THE INFORMATION SOCIETY 2018 MULTICONFERENCE

The Information Society Multiconference (http://is.ijs.si), now in its twenty-first consecutive edition, is the leading Central European event in the field of information society, computer science and informatics. This year's event again takes place at several locations, with the main events held at the Jožef Stefan Institute. Information society, knowledge and artificial intelligence remain the core concepts of human civilization. Will the incredible growth continue and carry us into a new civilizational era, or will it slow down and begin to stagnate?
Will ICT, and artificial intelligence in particular, enable civilization to flourish further, or will demographic, social, interpersonal and environmental problems choke off growth? A growing number of indicators point to both extremes: we are moving into the next civilizational era, while the internal and external conflicts of modern society are becoming ever harder to manage. This year the multiconference brings together 11 excellent independent conferences. A total of 215 presentations, abstracts and papers will be presented at the individual conferences and workshops. The event will be accompanied by round tables and discussions, as well as special events such as the award ceremony. Selected contributions will also be published in a special issue of the journal Informatica, which boasts a 42-year tradition as an excellent scientific journal.

The Information Society 2018 Multiconference consists of the following independent conferences:
• Slovenian Conference on Artificial Intelligence
• Cognitive Science
• Data Mining and Data Warehouses – SiKDD
• International Conference on High-Performance Optimization in Industry, HPOI
• AS-IT-IC Workshop
• Facing Demographic Challenges
• Collaboration, Software and Services in Information Society
• Workshop on Electronic and Mobile Health and Smart Cities
• Education in Information Society
• 5th Student Computer Science Research Conference
• International Technology Transfer Conference (ITTC)

The co-organizers and supporters of the conference are various research institutions and associations, among them ACM Slovenia, the Slovenian Artificial Intelligence Society (SLAIS), the Slovenian Society for Cognitive Sciences (DKZ) and the second Slovenian national academy, the Slovenian Engineering Academy (IAS). On behalf of the conference organizers, we thank the associations and institutions, and especially the participants for their valuable contributions and for the opportunity to share with us their experience of the information society. We also thank the reviewers for their help with the reviewing.
In 2018 we will present, for the sixth time, the award for lifetime achievement in honour of Donald Michie and Alan Turing. The Michie-Turing award for an outstanding lifetime contribution to the development and promotion of the information society will be received by Prof. Dr. Saša Divjak. The recognition for the achievement of the year will go to Assist. Prof. Dr. Marinka Žitnik. For the seventh time we are presenting the "information lemon" and "information strawberry" awards for the most (un)successful moves related to the information society. This year the lemon goes to the decline in state funding for research, and the strawberry to the Yaskawa robot factory in Kočevje. Congratulations to the award winners!

Mojca Ciglarič, Programme Committee Chair
Matjaž Gams, Organizing Committee Chair

FOREWORD - INFORMATION SOCIETY 2018

In its 21st year, the Information Society Multiconference (http://is.ijs.si) remains one of the leading conferences in Central Europe devoted to information society, computer science and informatics. In 2018, it is organized at various locations, with the main events taking place at the Jožef Stefan Institute. Information society, knowledge and artificial intelligence continue to represent the central pillars of human civilization. Will the pace of progress of information society, knowledge and artificial intelligence continue, thus enabling unseen progress of human civilization, or will the progress stall and even stagnate? Will ICT and AI continue to foster human progress, or will the growth of human, demographic, social and environmental problems stall global progress? Both extremes seem to be playing out to a certain degree: we seem to be transitioning into the next civilization period, while the internal and external conflicts of contemporary society seem to be on the rise. The Multiconference runs in parallel sessions with 215 presentations of scientific papers at eleven conferences, together with many round tables, workshops and award ceremonies.
Selected papers will be published in the Informatica journal, which boasts a 42-year tradition of excellent research publishing.

The Information Society 2018 Multiconference consists of the following conferences:
• Slovenian Conference on Artificial Intelligence
• Cognitive Science
• Data Mining and Data Warehouses - SiKDD
• International Conference on High-Performance Optimization in Industry, HPOI
• AS-IT-IC Workshop
• Facing demographic challenges
• Collaboration, Software and Services in Information Society
• Workshop Electronic and Mobile Health and Smart Cities
• Education in Information Society
• 5th Student Computer Science Research Conference
• International Technology Transfer Conference (ITTC)

The Multiconference is co-organized and supported by several major research institutions and societies, among them ACM Slovenia, i.e. the Slovenian chapter of the ACM, Slovenian Artificial Intelligence Society (SLAIS), Slovenian Society for Cognitive Sciences (DKZ) and the second national engineering academy, the Slovenian Engineering Academy (IAS). On behalf of the conference organizers, we thank all the societies and institutions, and particularly all the participants for their valuable contribution and their interest in this event, and the reviewers for their thorough reviews.

For the sixth year, the award for life-long outstanding contributions will be presented in memory of Donald Michie and Alan Turing. The Michie-Turing award will be given to Prof. Saša Divjak for his life-long outstanding contribution to the development and promotion of information society in our country. In addition, an award for current achievements will be given to Assist. Prof. Marinka Žitnik. The information lemon goes to decreased national funding of research. The information strawberry is awarded to the Yaskawa robot factory in Kočevje. Congratulations!
Mojca Ciglarič, Programme Committee Chair
Matjaž Gams, Organizing Committee Chair

KONFERENČNI ODBORI / CONFERENCE COMMITTEES

International Programme Committee
Vladimir Bajic, South Africa
Heiner Benking, Germany
Se Woo Cheon, South Korea
Howie Firth, UK
Olga Fomichova, Russia
Vladimir Fomichov, Russia
Vesna Hljuz Dobric, Croatia
Alfred Inselberg, Israel
Jay Liebowitz, USA
Huan Liu, Singapore
Henz Martin, Germany
Marcin Paprzycki, USA
Karl Pribram, USA
Claude Sammut, Australia
Jiri Wiedermann, Czech Republic
Xindong Wu, USA
Yiming Ye, USA
Ning Zhong, USA
Wray Buntine, Australia
Bezalel Gavish, USA
Gal A. Kaminka, Israel
Mike Bain, Australia
Michela Milano, Italy
Derong Liu, USA
Toby Walsh, Australia

Organizing Committee
Matjaž Gams, chair
Mitja Luštrek
Lana Zemljak
Vesna Koricki
Mitja Lasič
Blaž Mahnič
Jani Bizjak
Tine Kolenik

Programme Committee
Franc Solina, co-chair
Viljan Mahnič, co-chair
Cene Bavec, co-chair
Tomaž Kalin, co-chair
Jozsef Györkös, co-chair
Tadej Bajd
Jaroslav Berce
Mojca Bernik
Marko Bohanec
Ivan Bratko
Andrej Brodnik
Dušan Caf
Saša Divjak
Tomaž Erjavec
Bogdan Filipič
Andrej Gams
Matjaž Gams
Marko Grobelnik
Nikola Guid
Marjan Heričko
Borka Jerman Blažič Džonova
Gorazd Kandus
Urban Kordeš
Marjan Krisper
Andrej Kuščer
Jadran Lenarčič
Borut Likar
Mitja Luštrek
Janez Malačič
Olga Markič
Dunja Mladenič
Franc Novak
Vladislav Rajkovič
Grega Repovš
Ivan Rozman
Niko Schlamberger
Stanko Strmčnik
Jurij Šilc
Jurij Tasič
Denis Trček
Andrej Ule
Tanja Urbančič
Boštjan Vilfan
Baldomir Zajc
Blaž Zupan
Boris Žemva
Leon Žlajpah

KAZALO / TABLE OF CONTENTS

Delavnica AS-IT-IC / AS-IT-IC Workshop
PREDGOVOR / FOREWORD
PROGRAMSKI ODBORI / PROGRAMME COMMITTEES
Austrian-Slovenian Intelligent Tourist Information Center: Project Progress Report / Zupančič Jernej, Gams Matjaž
Tourism Related ICT Tools: a Review / Grasselli Gregor, Zupančič Jernej
AS-IT-IC Databases / Zupančič Jernej, Tazl Oliver August, Mahnič Blaž, Grasselli Gregor
Content API - A Cloud-based Data Source for the AS-IT-IC Platform / Tazl Oliver August, Wotawa Franz
e-Tourist 2.0: an Adaptation of the e-Tourist for the AS-IT-IC Project / Grasselli Gregor
Planning-based Security Testing for Chatbots / Bozic Josip, Wotawa Franz
Indeks avtorjev / Author index

Delavnica AS-IT-IC / AS-IT-IC Workshop
Uredila / Edited by Matjaž Gams, Jernej Zupančič
12. oktober 2018 / 12 October 2018
Ljubljana, Slovenia

FOREWORD

The AS-IT-IC Workshop enables the presentation of use cases and the exchange of experience among scientists and other stakeholders in the field of smart tourism, which is made possible by intelligent tools and services supported by information and communication technologies (ICT), above all artificial intelligence (AI). The workshop strengthens ties and cooperation between providers of practical tourist services and the research community, and encourages the use of advanced solutions in tourism.
The workshop is one of the activities of the Austrian-Slovenian Tourist-Information Center (AS-IT-IC) project, which was accepted under the Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020. The main goal of the project is an operational center where tourist information providers and virtual assistants work together to answer tourists' tourism-oriented questions and help them plan trips in the Slovenian-Austrian cross-border area. The accepted papers describe the state of the AS-IT-IC project one year before its conclusion. The paper Austrian-Slovenian Intelligent Tourist-Information Center: Project Progress Report summarizes progress with respect to the project and programme indicators, while the other papers focus on individual components of the final project results. Tourism Related ICT Tools: a Review presents various ICT solutions that help tourists and tourist information providers. AS-IT-IC Databases presents the data and data services available on the AS-IT-IC platform. The Content API paper presents a system for acquiring and accessing data. e-Tourist 2.0 presents an upgraded version of the system for trip planning and recommendation. In Planning-based Security Testing for Chatbots the authors describe an advanced system for detecting security weaknesses in conversational assistants.

INTRODUCTION

The AS-IT-IC Workshop is a forum for presenting use cases and exchanging experience among academic and service-industry partners on deploying intelligent tools and services, supported by information and communication technology and artificial intelligence in particular, that enable smarter tourism. It also aims to stimulate further adoption of such solutions through promotional activities and by establishing direct collaboration between academia and industry.
The workshop is an activity of the Austrian-Slovenian Intelligent Tourist-Information Center project (AS-IT-IC project), funded under the Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020. The main project output will be an operational center with human and virtual assistants that automatically answer tourism-oriented questions in natural language and perform services that enable trip planning in the Slovenian-Austrian cross-border region. The accepted papers describe the state of the AS-IT-IC project one year before its conclusion. Austrian-Slovenian Intelligent Tourist-Information Center: Project Progress Report summarizes the project progress with respect to project and programme indicators, while the other contributions focus on specific modules of the final project results. Tourism Related ICT Tools: a Review presents different ICT solutions that aim to help tourists and tourist information providers. AS-IT-IC Databases describes the data and data services made available through the AS-IT-IC Platform. The Content API paper presents the system for retrieving and serving the data. In e-Tourist2.0 the authors write about the upgraded trip planning and recommendation solution. Finally, the Planning-based Security Testing for Chatbots paper presents smart testing for security leaks under common attack scenarios.
PROGRAMSKI ODBOR / PROGRAMME COMMITTEE

Matjaž Gams, IJS (chair)
Franz Wotawa, IST (co-chair)
Josip Božič, IST
Jernej Zupančič, IJS
Tomaž Šef, IJS
Oliver August Tazl, IST
Dieter Hardt-Stremayr, GRAZ
Katarina Čoklc, ZOS
Marija Lah, SPOTUR

Austrian-Slovenian Intelligent Tourist Information Center: Project Progress Report 2018

Jernej Zupančič
Jožef Stefan Institute and Jožef Stefan International Postgraduate School
Jamova cesta 39, Ljubljana, Slovenia
jernej.zupancic@ijs.com

Matjaž Gams
Jožef Stefan Institute and Jožef Stefan International Postgraduate School
Jamova cesta 39, Ljubljana, Slovenia
matjaz.gams@ijs.si

ABSTRACT
Austrian-Slovenian Intelligent Tourist Information Center (AS-IT-IC) is a project that was accepted in the Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020 call. The project goal is to create a joint Austrian-Slovenian center: a network of service providers, tourist offices, municipalities, tourists and citizens, supported by information and communication technology (ICT), that enhances continuous cooperation between them. The main project output will be the AS-IT-IC operational center, staffed by humans and supported by ICT tools for communication, automatic question answering in natural language, information provision, trip recommendation and trip planning. This paper gives an overview of the current state of the project.

Keywords
virtual assistants; chatbots; chat platforms; tourism; natural language understanding; AS-IT-IC project

1. INTRODUCTION
According to [1], a tourist cannot get the desired information in an integrated way from both humans and Web services, much less from joint Austrian-Slovenian services. Typically, a Slovenian or Austrian tourist office will provide only predefined national tours and not user-centric cross-border tours. As a consequence, tourists may miss locations they might be interested in visiting, and tourist locations get fewer visits. The goal of the project is to create a joint Austrian-Slovenian center: an ICT-supported network of service providers, tourist offices, municipalities, tourists and citizens that enhances continuous cooperation between them. As a result, cross-border tourist exchange, collaboration and expertise transfer between providers are expected to increase considerably with respect to the current state. The main project output will be an operational center supported by the following tools: Virtual assistant (automatically answering questions in natural language and performing services according to demands from tourists), Communication service (an ICT solution that enables conversation between tourists, virtual assistants, tourist information workers and local communities), Information sources (inclusion of existing information sources), Recommender system for tour planning, and Network of tourist services and services from local communities.

The system will help tourists better plan their cross-border visits, discover less popular sites that would otherwise be missed, stay longer and better satisfy their needs. Local communities will be able to offer local services and information to visitors easily; for example, a tour might include a visit to a specialized craftsman and so boost the sale of local products. Tourist officers will get better access to tourists. The AS-IT-IC project (Table 1) integrates virtual and human services from Austria and Slovenia with a uniform functionality: to provide the most relevant information, attract tourists, and prolong their stay.

Table 1: Project information card

Title      Austrian-Slovenian Intelligent Tourist Information Center
Partners   1. Institut "Jožef Stefan" (lead partner)
           2. Technische Universität Graz, Institut für Softwaretechnologie
           3. Javni zavod za turizem, šport, mladinske in socialne programe SPOTUR Slovenj Gradec
           4. Združenje občin Slovenije
           5. Graz Tourismus und Stadtmarketing GmbH
Duration   From 1. 7. 2016 to 30. 6. 2019

The rest of the paper is organized as follows. In Section 2 we describe the state of deliverables and project workpackages, in Section 3 we describe the project idea, while Section 4 gives an overview of the state of the prototypes and of what has been accomplished so far. Sections 5 and 6 describe the project dissemination activities and project impact to date, respectively. Section 7 concludes the paper.

2. PROJECT PROGRESS
The project has entered the last year of implementation (Figure 1). While the majority of the technical details (Table 2) have been resolved, the project results are still under active implementation and testing. Additionally, the dissemination strategy and sustainability plan will be addressed in more detail in the coming months.

Figure 1: Project Gantt chart

Table 2: Project deliverables status

Workpackage                    Deliverable                                 Status
Management                     Project reports                             3/6
Communication                  Dissemination and promotion report          not started
                               Promotion material
                               Publications                                2/4
                               Scientific publications                     7/1
                               Workshops on AS-IT-IC                       4/3
                               Participation in tourist related events     0/3
                               Project website
Tourist information platform   System requirements and specification
                               Tourist information platform                in progress
                               Content items
                               Content creation guidelines
                               Communication applications                  in progress
Virtual assistant              Virtual assistant requirements
                               Virtual assistant service prototype         in progress
                               Virtual assistant service                   not started
Tour planning                  Tour planning requirements
                               Tour planning service prototype             in progress
                               Tour planning service                       not started
AS-IT-IC Center                AS-IT-IC Deliverable
                               AS-IT-IC Center                             in progress

3. PROJECT IDEA
The AS-IT-IC project tries to combine several solutions that already cover parts of smarter tourism: attraction discovery, trip planning [2], and communication with human and virtual assistants [3, 4]. In order to provide a state-of-the-art platform that enables smarter tourism, several open source technologies, data sources and internal tools and services were examined and upgraded, and are now being integrated into one tourism platform, the AS-IT-IC platform. Using open source software enables us to start with a solid working solution and make the modifications required by the project. A simplified reference architecture is presented in Figure 2.

Figure 2: Simplified architecture corresponding to the project idea

3.1 Communication platform
The communication platform enables the users to communicate with each other (tourist – tourist, tourist – tourist information provider, tourist – virtual assistant) over a chat-based interface. The increased popularity of chat applications (Facebook Messenger, https://www.messenger.com/; WhatsApp, https://www.whatsapp.com/) shows that this is a valid communication option for exchanging and obtaining information. The main benefit is the option to upgrade the communication by integrating various virtual assistant services.

3.2 Virtual assistants
Virtual assistants (also chatbots or conversational robots [3]) are computer programs that can process input in natural language and provide a reply. The input can be either voice or text, and the answer is usually a combination of a response in natural language and an action carried out on the basis of the user input. An example would be as follows. The user asks "What are some cultural heritage sights I could visit near me?". The virtual assistant would first identify the intent (the user would like to execute a search) and the arguments (location: near the user, type of sight: cultural heritage). Then it would acquire the user location and user preferences from the system and issue a request to the system database in order to obtain relevant attractions. The results would then be properly formatted and presented to the user: "I have found the following attractions that match your search: Ljubljana castle, Cankar memorial house and Manor pavilions. Would you like to learn more about a specific attraction?".

3.3 Trip planner
Trip planners [2] help the user plan a trip by keeping track of the places the user intends to visit, recommending attractions and points of interest relevant to the user, and automatically arranging the itinerary in order to optimize the travel between its items. Trip planners usually also give the user a visual overview of the whole trip and sometimes even provide navigation.

3.4 Databases
At the heart of the platform are the databases that provide all the information required by the AS-IT-IC services. The databases provide structured data that can be used by several services for further processing. They consist of: information about attractions and other points of interest (castles, caves, restaurants, etc.); information about geographical entities (places, regions, rivers, municipalities, etc.); and information scraped from useful webpages (municipality information, opening hours, etc.).

Besides the "Content" databases, the system also requires databases for user management and for storing the system states and user-generated content (conversations, saved trip itineraries, etc.).

4. PROTOTYPES
The AS-IT-IC platform will consist of services deployed using either Docker (https://www.docker.com) virtualization technologies or Flynn (https://flynn.io/), a self-hosted platform-as-a-service.

4.1 Communication platform
The open source team communication software Rocket.Chat (https://rocket.chat/) was chosen as the base. In order to meet the project requirements, several additions were developed: a custom home dashboard; a message modification for improved user experience; a custom information tab with information about the trip; notification modifications for better operation of the mobile communication application; and a custom application programming interface for automatic message processing and posting. A screenshot of the conversation user interface is presented in Figure 3.

Figure 3: Conversation user interface

4.2 Virtual assistants
The virtual assistant used in the AS-IT-IC platform comprises several modules. Two approaches were used when designing the assistant modules: a rule-based approach (an upgrade of the virtual assistants deployed at Jožef Stefan Institute and the majority of Slovenian municipalities [3]) and a natural language based approach [4]. The rule-based models are more stable and easier to debug and understand; however, every rule has to be designed by hand, which is why they take a long time to implement. The natural language based modules, in contrast, make it possible to produce a virtual assistant that transforms natural language input into a structured format that can be further used by computer programs. The main disadvantages of such systems are the need for a language model (an active area of research, especially for smaller languages such as Slovenian) and the need for a large set of training data.

Within the AS-IT-IC Platform the rule-based approach is used for virtual assistant actions that result from user interaction with uniquely identifiable objects in the user interface (for instance buttons) and for common text input provided by the user. The natural language based approach is used for intelligent search capabilities and in cases where the rule-based approach fails. Two backends are currently used for parsing the user input and translating it into structured text: Dialogflow (https://dialogflow.com/) and Rasa (https://rasa.com/). Additionally, in some cases we take advantage of the full-text search capability of the PostgreSQL (https://www.postgresql.org/) database; we have made the adjustments required for the full-text search to work in all three project languages: Slovenian, English and German.

Adapters were developed that enable the interaction between the communication system and the virtual assistant services: reading the user input, processing the text, performing the required actions and returning a response to the user.

4.3 Trip planner
The basis for the trip planner used in the AS-IT-IC platform was the e-Tourist application [2]. In order to meet the project requirements, heavy modifications were made to the e-Tourist codebase: the databases were extended to support additional attraction and geographical data; the routing was adjusted so that the trip planner can be used by third party applications; user management was upgraded to enable AS-IT-IC platform users to log in to the system automatically; several functions that were previously coded by hand were moved to the database, which significantly reduced the project size; and all the modules used by the application were upgraded to the latest versions, thereby increasing the application security.

4.4 Databases
Webpages and data sources that contain relevant tourism-oriented information and allow the content to be used for non-commercial purposes were reviewed and gathered. While there are many data sources with relevant information (Slovenia.info, https://www.slovenia.info/en; DEDI, http://www.dedi.si/; OPSI, https://podatki.gov.si/; Geoportal ARSO, https://gis.arso.gov.si; e-Geodetski podatki, https://egp.gu.gov.si/egp/; eVode, http://www.evode.gov.si/sl/vodni-kataster/zbirka-vode/zbirka-podatkov-o-povrsinskih-vodah/hidrografija/; register kulturne dediščine, http://www.mk.gov.si/si/storitve/razvidi_evidence_in_registri/register_nepremicne_kulturne_dediscine/), additional work was needed to unify the data formats, remove the data that was not of sufficient quality and integrate all the data into one database.

5. DISSEMINATION
Project partners have been active in disseminating the project results: producing scientific papers for conferences and international journals, producing publications for the general public, maintaining the project website, hanging project posters at partner sites, and organizing workshops.

Scientific papers and publications for the general public introduced the project to a wider audience by presenting the project idea, describing the need for such a project, and presenting the tools, services and prototypes developed within the project. Project partners have so far contributed to 7 scientific papers and 2 publications for a general audience.

The project website (https://as-it-ic.ijs.si) was deployed in the first half-year of the project. It presents all relevant information about the project and project partners, together with the project news. Additionally, it enables visitors to contact the project partners.

Project posters were designed according to the Cooperation Programme Slovenia-Austria rules and posted at partner sites (Jožef Stefan Institute, GUT Institute for Software technologies, and Association of municipalities of Slovenia).

Workshops are one of the main dissemination channels: partners either invite the general audience to attend or present the project and its results to a general audience. Several workshops were organized at which the AS-IT-IC project was presented: the "AS-IT-IC Workshop" within the Information society 2017 Multiconference, the "Presentation of tourism applications" workshop held by invitation in Nazarje, the "Artificial Intelligence into every municipality" workshop organized at Jožef Stefan Institute, and a "Site visit" workshop organized at Jožef Stefan Institute, to which Jožef Stefan Institute employees were invited.

6. PROJECT IMPACT
The AS-IT-IC project enables the project partners to greatly increase cooperation in the cross-border area. The partners have organized five cross-border meetings, which resulted in the exchange of information, data and examples of good practice in the field of tourism, and also in an additional project application. The partners also cooperated in the organization of workshops, which enabled them to reach several third party stakeholders so far: 7/50 (7 reached out of 50 promised) representatives of local public authorities (ministries and municipalities); 10/30 representatives of interest groups including non-governmental organizations (development centers, tourist organizations); 7/5 small and medium sized companies; and 89/3000 interested individuals.

The main communication goals of the project are:

1. Integrated tourist communication service: to raise awareness of the AS-IT-IC platform. AS-IT-IC was mentioned five times in press and web page news.

2. Tourist workers networking: to change behaviour and increase the cooperation of tourist workers and providers of ICT tools for tourism. Nine meetings and workshops were organized to promote AS-IT-IC and enable such networking.

3. eHeritage: to raise awareness of the need to make descriptions of heritage sights and attractions available on the Internet in a way that enables easy search and the inclusion of such attractions in tourists' itineraries. So far approximately 1000 heritage attractions have been inserted into the AS-IT-IC content database.

7. CONCLUSION
In the paper we presented the AS-IT-IC project, its goals and the issues it addresses. Project partners from the Slovenian-Austrian cross-border area came together in order to enable smarter tourism by integrating several tools and services into one platform. The project has entered its last year of implementation and so far the development has gone according to plan. The prototypes for the communication platform, virtual assistant, and trip planner are under development and will soon be ready for integration into the AS-IT-IC platform and for testing by end users.

The majority of the future work will be on disseminating the project results and attracting a larger user base. To this end the tourist partners are in contact with the major tourist organizations (such as the Slovenian tourist board), which will enable us to reach a wider target audience and receive useful user feedback.

Additionally, the partners will look into sustainable options for transferring the AS-IT-IC Platform management to interested organizations. This will enable additional growth of the AS-IT-IC Center, while the partners will maintain the functionality of the AS-IT-IC platform as developed for the purpose of the project.

8. ACKNOWLEDGMENTS
The work was co-funded by Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020, project AS-IT-IC.

9. REFERENCES
[1] AS-IT-IC Project partners. About the project. https://as-it-ic.ijs.si/about/, 2016. [Online; accessed 16-September-2018].
[2] B. Cvetković, H. Gjoreski, V. Janko, B. Kaluža, A. Gradišek, M. Luštrek, I. Jurinčič, A. Gosar, S. Kerma, and G. Balažič. e-turist: An intelligent personalised trip guide. Informatica, 40(4):447, 2016.
[3] D. Kužnar, A. Tavčar, J. Zupančič, and M. Duguleana. Virtual assistant platform. Informatica, 40(3):285, 2016.
[4] J. Zupančič, G. Grasselli, A. Tavčar, and M. Gams. Virtual assistants for the austrian-slovenian intelligent tourist-information center. In Proceedings of the 12th International Multiconference Information Society - IS 2017, volume E, pages 27–30, Ljubljana, Slovenia, 2017. Jožef Stefan Institute.

Tourism Related ICT Tools: a Review

Gregor Grasselli
Jožef Stefan Institute and Jožef Stefan International Postgraduate School
Jamova cesta 39, Ljubljana, Slovenia
gregor.grasselli@ijs.si

Jernej Zupančič
Jožef Stefan Institute and Jožef Stefan International Postgraduate School
Jamova cesta 39, Ljubljana, Slovenia
jernej.zupancic@ijs.com

ABSTRACT
In the paper we review the existing information and communication technology (ICT) based tools and services that empower tourists and tourist information providers in obtaining and providing the information needed for trip planning. We define four tourism-related service categories: search with booking, trip planners, chatbots, and forums. We summarize the good practices identified in the reviewed tools and expose the issues stemming from the fragmentation of the tools and data. In order to overcome the identified problems we propose the AS-IT-IC Platform, a result of the Austrian-Slovenian Intelligent Tourist Information Center (AS-IT-IC) project, accepted in the Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020 call. The AS-IT-IC project aims to integrate several functionalities usually offered by distinct services, unify the databases, and provide free access to tools and services for tourists and tourist information providers.

Keywords
virtual assistants; chatbots; tourism; trip planning; search; review; AS-IT-IC project

1. INTRODUCTION
Due to the increase in Internet usage it has become paramount that organizations and their tools are accessible through the Internet. In the last few years a steep increase […]

[…] and applications, when searching for the trip-related information; tourists to call or go to local tourist information offices for additional info; tourist information providers to keep their information available in several systems all over the internet and structured in different, often incompatible, ways; tourist information providers to pay for the presence on tourism tools and services. There is clearly room for improvement [3], and the first step would be the establishment of an online platform that integrates several services and tools in one place and enables users to gather and provide all necessary information related to trip planning. One such platform is being developed within the Austrian-Slovenian Intelligent Tourist Information Center (AS-IT-IC) project.

The rest of the paper continues with the description of four categories of tools and services in Sections 2, 3, 4 and 5. Good practices are summarized in Section 6 and the need for an integrated solution is addressed in Section 7. Section 8 concludes the paper.

2. SEARCH WITH BOOKING
The group offering the least functionality is that of search engines for hotels, plane flights, restaurants and the like. The user is mostly required to input a predetermined set of parameters, and is then referred to a list of matching options to choose from. This list can sometimes be further filtered and/or sorted based on additional criteria.
After in the mobile usage has additionally fueled the development deciding on what suits them best, the user is relegated to of mobile applications and mobile friendly web applications, booking, which is how the site makes money. several of those in the domain of tourism. Of course most if not all hotels, airlines, tourist agencies and In this paper we present the tools provided by organizations other tourist providers have their own websites, which allow that operate on a global scale. We categorize the tools into a prospective user to directly book their services. The exam- four categories: ples below are of search engines that accumulate data from many such sites, or get it directly from the end providers and try to make it more convenient for a prospective tourist 1. Search with booking to find what they are looking for. 2. Trip planners 3. Chatbots – conversational robots • Sabre – https://en.eu.sabretravelnetwork.com/home/ page/book_flights_hotels 4. Forums • Expedia – https://www.expedia.com • Each category is unique: in the ways it helps the user in the TripAdvisor – https://www.tripadvisor.com process of trip planning; in the type of data it uses; and in • Priceline – https://www.priceline.com the target users. This, however, leads to the fragmentation of data and the need for: tourists to check several web sites • Yelp – https://www.yelp.com 10 • OpenTable – https://www.opentable.com/start/home 4.1 Customer service chatbots One of the trends in using chatbots is to automate customer • Booking.com – https://www.booking.com service on a company’s website. Things like providing an- swers to frequently asked questions, or finding relevant in- • Skyscanner – https://www.skyscanner.com formation on the website without having to manually search for it can easily be automated by chatbots. The aim of these 3. 
TRIP PLANNERS bots is to reduce the load on human customer service staff A slightly more interesting group is the group of trip plan- and to provide a better customer experience by making it ners. They also require the user to input a set of predeter- easier and faster to get answers from data on the website. In mined parameters, but unlike in the previous group where this context a chatbot is an addition to an existing website, the parameters are mostly the same on all websites, here often appearing as a chat window in one of the corners. they vary quite a lot from service to service. What is dif- ferent is also that these services mostly output a single trip Examples of such chatbots are: plan which can later be modified. After a user decides on their final trip plan, they are offered accommodation and • Ana – https://connectmiles.copaair.com/en/web/ transportation booking necessary for the trip’s realization. guest/ask-ana The booking stage on some of the websites utilizes services from the first category. Some interesting examples of such • Julie – https://www.amtrak.com/about-julie-amtrak- services are provided below. virtual-travel-assistant 4.2 Instant messaging chatbots • Roadtrippers – https://roadtrippers.com With rising popularity of instant messaging platforms like Facebook’s messenger6 and Telegram7 that expose APIs for • Mapquest – https://www.mapquest.com/routeplanner bots to converse with users, many bots now exploit this channel for access to users. By responding to messages in • Triphobo Tripplanner – https://www.triphobo.com/ group chats (possible on Facebook’s messenger for instance), tripplanner chatbots are a new way for making a product more discov- erable and for making the user’s trip from wish to purchase • Inspirock – https://www.inspirock.com shorter. 
• Sygic Travel – https://travel.sygic.com In many cases chatbots present on an instant messaging plat- • routeperfect – https://www.routeperfect.com/trip- form are just a different interface to an existing service. Ex- planner amples of such bots are: • wanderapp – https://www.wanderapp.me • Expedia – https://viewfinder.expedia.com/features/ • Go Real Europe – introducing-expedia-bot-facebook-messenger https://www.gorealeurope.com • Skyscanner – https://www.skyscanner.net/news/tools/ 4. CHATBOTS skyscanner-facebook-messenger-bot With the emergence of natural language processing tools • Cheapflights – https://www.cheapflights.co.uk/news/ like dialogflow1, wit2 and rasa3 that make it quite easy for cheapflights-chat-awards developers to implement chat based interfaces in a growing number of languages, there has been an explosion in the • Hello Hipmunk – https://www.hipmunk.com/hello number of applications and websites that offer a chat based interface. There are even companies like Botflux4 that offer There are a number of chatbots that act as aggregators over their customers custom made chatbots. different services. They provide a conversational interface for searching offers from many sources and providing the Facing the consumer, chatbots offer a more dynamic (re- user with the result that most closely matches their require- quest values for different parameters based on the user’s in- ments. An example of such a bot is Assist8, a chatbot that put so far) and interactive approach to defining the user’s aggregates several services for making hotel reservations, requirements than classical forms and menus. They also of- ride hailing, making table reservations and online shopping. fer a completely new experience when using a voice interface It is the only product of a start up with the same name and through a smartphone or a specialized device such as Ama- can be used through messenger, Telegram, SMS, Twitter, zon’s Echo or Google Home. 
A good resource for finding in- Google Assistant and Slack. teresting chatbots based on the messaging platform, where one can converse with them, and their area of expertise is According to statista9, business travel in 2016 amounted to botlist.co5. 1.3 trillion USD and represented about 10% of all travel 1https://console.dialogflow.com 6https://www.messenger.com/ 2https://wit.ai 7https://telegram.org/ 3https://rasa.com 8http://www.assi.st/ 4https://www.botflux.com/tourism 9https://www.statista.com/topics/2439/global- 5https://botlist.co/ business-travel-industry 11 spending in 2015. Thus it makes sense that a number of In most cases travel related forums appear as part of a big- travel chatbots are specifically targeting business travellers: ger travel related website. The popular TripAdvisor website also includes a typical travel forum13 of this kind. Questions and responses are checked for destinations and attractions • Carla – https://www.cwtcarla.com/CarlaWeb TripAdvisor knows and if any are found, they are displayed in a card, below the user’s post, showing their name, a pic- • Pana – https://pana.com ture and their ranking. Clicking on the card will show the site for that attraction. • MEZI – https://mezi.com Another example of a website that also includes a forum is Lonely Planet14. According to Wikipedia15 they are the A different kind of application is a so called virtual concierge. largest travel guide publisher in the world. They also pro- Its main goal is to assist in communication between hotel vide a website that would fall into the search with booking staff and their guests by providing an interface for checking category, coupled with their newsletter and of course selling into a hotel, ordering room service, and requesting informa- travel guides. This rather expansive website also includes a tion about the hotel. 
They have automatic translation inte- forum for exchanging ”travel advice, hints and tips” as they grated into the service, so customers can interact with the put it. The forum is not limited to country based discus- staff in their native language. These applications are in this sions but also has (among others) sections about equipment, category because they mainly function through a conversa- travel health and vaccinations, searching for travelling com- tional interface and use instant messaging technology. Also panions, house sitting and swapping as well as people selling some of their functions are fully automated so they are bots and buying stuff through the forum. and not just messaging apps with translations. An example of such a virtual concierge is The Besty10, a phone app to According to the quick analysis of posts, Fodor’s Travel help people communicate with their hotel’s staff as well as Talks Forum16 seems to be the most popular. They also find and book tours, restaurants and activities at the “low- sell guidebooks and have a very extensive website that also est” prices. They also offer tour guides and live chat with lo- offers hotel bookings. cal tour experts. The MEZI chatbot, mentioned earlier also offers a virtual concierge service of this kind as part of its As we are living in the age of social media, it would be remiss capabilities. Another example of this kind of chatbot is Hi not to mention the #travel tag on Twitter17, used to post Jiffy11. It is available on messenger, and allows searching for about travel experiences, as well as the existence of quite hotels and making reservations in addition to its customer a number of twitter accounts that are dedicated solely to care functionality. It employs a model where queries that travel news18. cannot be answered automatically are forwarded to hotel staff. The query and the provided answer are then included 6. 
GOOD PRACTICES in the bot’s training set so that it can answer automatically By reviewing the existing ICT solutions available on the in- when a similar query is input by a user. According to its ternet, the following commonalities and good practices were website 77% of its answers are provided automatically at the observed. time of this writing. 5. FORUMS 1. Since almost 53% of all internet traffic in 2017 was When people want opinions, recommendations or advice, produced through mobile devices19, having a mobile they turn to the forums, where they can ask questions re- application or a different way of making the application lated to their planned excursions and get answers from trav- work on a mobile phone (like through messenger, or a ellers who have been there before. On some of the forums, mobile-first web application) is a must. travel agents seem to be quite involved as well, answering 2. When possible it is a good idea to integrate with appli- questions by prospective tourists while advertising their ser- cations users are already using in their everyday life, vices. All of the forums we came across cover travelling to like calendars. This allows to get user data without the whole world, but usually have a different section for each needing the user to type everything, as well as enables continent which is then further divided by country. the user to use the results of an application more con- veniently. Most forums also have a section dedicated to posting longer accounts of travellers who believe they have experienced 3. Integrating multiple data sources into a single view something worth sharing. An interesting website that col- is very helpful for users as they get more complete lects longer posts by travellers as well as photos is Travel- blog12. 
In addition to their forum for discussing travel plans 13https://www.tripadvisor.com/ForumHome and asking for advice, they also have a blog section, where 14https://www.lonelyplanet.com/thorntree/welcome anyone can write about their experiences or post a photo 15https://en.wikipedia.org/wiki/Lonely_Planet they think is particularly eye catching, and the rest of the 16https://www.fodors.com/community users will vote on the best blog and photo of the week. 17https://twitter.com/ 18http://mashable.com/2012/08/04/travel-twitter/ 10https://thebesty.com/ #KtwUdPjm_Gqw 11http://hijiffy.com/ 19http://www.trendreports.com/article/technology- 12https://www.travelblog.org in-tourism 12 information about a destination and do not have to 3. Cooperation of several local tourist information cen- check multiple sources on their own. ters with vast knowledge on the touristic offers in their area. 4. Many popular applications scrape provider websites for information and special offers. Others rely on providers 4. Free access for tourists and tourist information providers. to manually enter all information. 5. Open access to data and data services. 5. Sites that cover a wider geographical area are more useful, since they provide a one-stop shop for the whole trip as opposed to having to visit several websites to The AS-IT-IC Platform, however, does have one big disad- get informed on each destination individually. vantage compared to the rest of the services – it is a publicly founded project with reserved funds for the development and 6. Availability through multiple channels. Having a web- initial activities to raise the project awareness and dissemi- site is fine, but also being available through other chan- nate results. After the end of the project, no resources have nels, especially instant messaging platforms really helps been granted yet to further promote project results. While with discoverability. the partners have committed to maintain the project results 7. 
Until the invention of general AI, machines will be lim- for another 5 years after the end of the project, the issue of ited in what they can do, so to minimize customer frus- getting sustainable funds to enable further promotion and tration, keeping humans in the loop on the provider dissemination is yet to be solved. side can be very helpful. 8. CONCLUSION 8. Making customization of automatically generated trip In this paper the ICT tools provided in order to empower plans and other suggestions as easy and complete as smarter tourism are presented. Providers from around the possible, or the users will only use the tool to get the world were taken into account. The tools were classified into suggestions then they will use more low level tools to categories in order to provide a sense of what is available for actualize the parts they liked. This lowers the conver- tourists and tourist service providers. Additionally, the tools sion rate of the tool and makes customers less happy. were critically assessed and good practices were identified. 9. Allowing users to filter and sort displayed information Further, the AS-IT-IC Platform was compared to existing based on their interests. A good example is how Road- tools and main similarities and differences were pointed out. trippers allows users to set what kinds of points of in- terest they want to see on the map. The review provides a basis for anyone interested in the deployment of tourism-oriented services. One has to take 7. RELATION TO AS-IT-IC into account, however, that not every problem in tourism has a technological solution. One of the main components of The AS-IT-IC project tries to combine several partial so- the AS-IT-IC project is the networking one, where the goal lutions already implemented by the ICT tools mentioned is to connect several stakeholders that provide technology in this paper: attraction discovery, trip planning [1], and solutions to the users in need of such solutions. 
communication with human and virtual assistants [2, 4]. In order to provide the state-of-the art platform that enables smarter tourism several open source technologies and data 9. ACKNOWLEDGMENTS sources were utilized and integrated into one tourism plat- The work was co-funded by Cooperation Programme Inter- form – AS-IT-IC platform. reg V-A Slovenia-Austria 2014-2020, project AS-IT-IC. AS-IT-IC project empowers tourists by: helping in obtain- 10. REFERENCES ing all the required information related to trip planning in [1] B. Cvetković, H. Gjoreski, V. Janko, B. Kaluža, one place; and enabling discovery of local and less known A. Gradišek, M. Luštrek, I. Jurinčič, A. Gosar, but still relevant attractions. Further, it empowers tourist S. Kerma, and G. Balažič. e-turist: An intelligent information providers by: providing an integrated way of ex- personalised trip guide. Informatica, 40(4):447, 2016. posing the tourism content in his or her area to the Internet; [2] D. Kužnar, A. Tavčar, J. Zupančič, and M. Duguleana. and enabling the access to the tourists in an asynchronous, Virtual assistant platform. Informatica, 40(3):285, 2016. modern chat-based style. [3] B. Peischl, O. A. Tazl, and F. Wotawa. Open questions of technology usage in the field of tourism. In The reviewed tools, together with traditional communica- Proceedings of the 12th International Multiconference tion methods, indeed already offer the same or at least very Information Society - IS 2017, volume E, pages 29–22, similar functionality as is planned for the AS-IT-IC plat- Ljubljana, Slovenia, 2017. Jožef Stefan Institute. form. However, even disregarding the obvious benefit of the [4] J. Zupančič, G. Grasselli, A. Tavčar, and M. Gams. functionality integration into one platform, there are still ad- Virtual assistants for the austrian-slovenian intelligent vantages of the AS-IT-IC project results, for the time being tourist-information center. 
In Proceedings of the 12th mainly for the Slovenian-Austrian cross-border area: International Multiconference Information Society - IS 2017, volume E, pages 27–30, Ljubljana, Slovenia, 2017. 1. Larger database of attractions. Jožef Stefan Institute. 2. Inclusion of path-based attractions – for instance wine roads, or walking trips and geographical information. 13 AS-IT-IC Databases Jernej Zupančič Oliver A. Tazl Blaž Mahnič “Jožef Stefan” Institute and Institute for Software “Jožef Stefan” Institute Jožef Stefan International Technology Jamova cesta 39 Postgraduate School Graz University of Technology Ljubljana, Slovenia Jamova cesta 39 Inffeldgasse 16b blaz.mahnic@ijs.si Ljubljana, Slovenia Graz, Austria jernej.zupancic@ijs.com oliver.tazl@ist.tugraz.at Gregor Grasselli “Jožef Stefan” Institute and Jožef Stefan International Postgraduate School Jamova cesta 39 Ljubljana, Slovenia gregor.grasselli@ijs.si ABSTRACT Roadtrippers3 and Triphobo Tripplanner4. Austrian-Slovenian Intelligent Tourist Information Center (AS-IT-IC) is a project that was accepted in the Cooper- Virtual assistant enables the user to obtain tourism-related ation Programme Interreg V-A Slovenia-Austria 2014-2020 information using a rich-text based interface, similar to the call and has two main goals: one is to build information ones provided by Facebook Messenger5 . Examples include and communication technnology (ICT) tools to support the Hello Hipmunk6 and MEZI7. tourist when he or she creates personalized itinerary for the visit of Slovenian-Austrian cross-border area; and the sec- Forums provide a place where usually users but sometimes ond is to create a sustainable community that will support also professionals provide descriptions of their trips, express the use of the tools. In this paper we describe the provision, their opinions about attractions and places to visit, and pro- cleaning, integration and deployment of data and data ser- vide helpful advice to fellow travellers. 
Examples include vices needed by the ICT tools in tourism. Data and data Fodor’s Travel Talks Forum8 and Lonely Planet9. services form one of the main pillars that enables the AS- IT-IC platform to provide tools and services, which could In order to provide such services developers need data. The serve tourism-related information to end users – tourists and data can be obtained in several ways: tourist information providers. 1. The information about attractions and other points of Keywords interest can be obtained in advance (as is the case with web spider; data; tourism; databases; web services; AS-IT- some trip planners and virtual assistants). IC project 2. By making an application programming interface (API) 1. INTRODUCTION call to an external service (search with booking). There is an increasing number of services and applications 3. By relying on users to provide the content when the available for tourists and tourist information providers across application is already live (forums). the Internet. The services could be roughly categorized into the following categories: Search with booking ; Trip planners; Virtual assistants; and Forums. The ICT tools that will be used directly by the users and are developed within the AS-IT-IC project integrate several ser- Search with booking enables the user to search for a type vices that belong to the before-mentioned categories: com- of accommodation, transport or adventure, specifying the munication platform that enables communication of tourists time interval of using the services and in some cases even with tourist information providers and virtual assistants; buying the service or reserving it. Examples are Expedia1 virtual assistant, which provides useful information 24/7; and OpenTable2. 
3https://roadtrippers.com 4 Trip planners enable the user to view attractions in certain https://www.triphobo.com/tripplanner 5 area, obtain additional attractions by clicking on them and https://www.messenger.com/ 6 forming a customized trip. Examples of such services are https://www.hipmunk.com/hello 7https://mezi.com 1https://www.expedia.com 8https://www.fodors.com/community 2https://www.opentable.com/start/home 9https://www.lonelyplanet.com/thorntree/welcome 14 and tour planning for the automatic creation of a trip. In fourteen data categories on the web site: Population order to provide useful tourist information services, we had and Society, Justice, the legal system and public safety, to combine several data procurement options, also used by Public Sector, Education, Culture and Sport, Social other systems that usually cover only a part of the AS-IT-IC and employment, Health, Environment and Spatial Plan- functionality. ning, Transport and infrastructure, Agriculture, fish- eries, Forestry and nutrition, Finance and Taxes, Econ- Additional difficulty in obtaining the data was the fact that omy, Energy, Science and Technology and International AS-IT-IC covers the Slovenian-Austrian cross-border area, Affairs. Some of the datasets available on the portal for which little structured information is available. This is are of interest also for the tourism domain, for instance especially true for natural and cultural heritage attractions, a computer readable map of bodies of water, where which are the focus of the Interreg Slovenia-Austria Pro- bathing is possible. gramme. We reviewed several data sources that are avail- able for non-commercial use and tried to include the most 5. Slovenian Cultural heritage register14. This is an of- relevant and quality ones. ficial database of cultural heritage on the territory of the Republic of Slovenia, provided by the Ministry of According to the review, the following types of data was Culture of Republic of Slovenia. 
The registry contains identified as useful: attraction data and tourism-related points- 30.095 entries of several types. The big disadvantage of of-interest (natural and cultural heritage, sights, activities, this database, however, is that the use of the database accommodation, places to eat etc.); geographical data (struc- is prohibited for online applications, which is a big tured representation of geographical entities such as rivers, drawback in the information age – especially since the lakes, municipalities, cities etc.); and services related to get- data is of public interest. ting from place A to place B, so called routing services. 6. Graz Tourism database. The data of this website com- prises the tourism sights, attractions and offers of the In the rest of the paper we review available data sources in city of Graz and its neighbouring regions. The data is Section 2. Further, we describe the data related to attrac- available via a back-end using a REST-JSON API. The tions in Section 3, the geographical data in Section 4, and data is maintained by Graz Tourism and the tourism the chosen routing system in Section 5. Section 6 provides partners to provide detailed and high-quality data. a brief description of how the data will be used within the AS-IT-IC Platform and Section 7 concludes the paper. 2. DATABASES AND DATA SOURCES The following data sources were identified as the most rele- vant: 1. Slovenia.info website10 (Figure 1). The data source comprises 8798 tourist attractions and is still growing, due to the fact that the tourist information providers are constantly uploading and updating new attraction descriptions. 2. Dedi.si website11. The data provided by this website comprises only natural and cultural heritage, there- fore being very suitable for the purpose of the project. Figure 1: Tourist attractions from Slovenia.info However, due to the incompatible data formats, the inclusion of the Dedi.si data is postponed for the time 3. 
3. Europeana website¹². This data source stores data about artworks, artifacts, books, films and music from European museums, galleries, libraries and archives from around the world. It contains around 58 million entries. However, because the data is collected automatically, the information is very often wrong. We have decided not to include the Europeana data.

4. Open data portal of Slovenia¹³. The portal provides information, tools, and useful resources that can be used in web and mobile applications. There are [...]

[...] being.

ATTRACTIONS DATA
This section describes all the attributes used to represent an attraction datum. The data was structured so that the AS-IT-IC services can provide different kinds of functionality.

The data was imported into a PostgreSQL¹⁵ database, where each datum is represented as follows (the data structure is based on the previously developed e-Tourist system [1]):

• title (sl, en, de, it): the title of the attraction, stored in four languages (Slovenian, English, German, Italian), for example “Gostišče Kimovec”
• description (sl, en, de, it): description of the tourist attraction in four languages
• category (sl, en, de, it): for example “Adrenaline sports”
• subcategory (sl, en, de, it): for example “Paragliding”
• location: GPS coordinates of the tourist attraction, for example “(45.94, 13.71)”
• figure: image that represents the tourist attraction (web path)
• trip advisor: attraction rating retrieved from the TripAdvisor¹⁶ web site
• address: “Zgornji Hotič 15, 1270 Litija”
• recommended viewing duration time: “1:30:00”
• price range: how much it costs to visit the attraction, “1-5”
• expert evaluation: the expert opinion on the quality of the attraction, “1-5”
• parking: parking options
• campers: availability of camper parking
• web page: “www.gostisce-kimovec.com”
• phone: “05 458 654”
• working hours: “mo-fr: 8:00-18:00”
• working hours comment: “Always opened”
• accessibility: how one can reach the attraction, “(car, walk, bike, boat, bus)”
• keywords: a few keywords that relate most closely to the attraction

The data categories identified by the data analysis are presented in Table 1.

Table 1: Category counts for the attraction data

Category                   Count
Accommodation              2040
Adrenaline sports          63
Casinos                    59
Cities                     649
Culture                    2337
Cycling and biking         326
Food and wine              1857
Hiking                     142
Nature                     703
Spas and health resorts    287
Sports                     72
Water activities           208
Winter sports              55

There are an additional 103 subcategories that further classify each attraction datum or point-of-interest; they are not listed here for reasons of space.

4. GEOGRAPHICAL DATA
The database of geographical data is composed of two parts: geographical data and statistical regions. The geographical data covers features of Slovenia such as lakes, rivers and caves. The data on water bodies was obtained from the “eVode portal”¹⁷ (eng. eWaters), while the data on other natural bodies was retrieved from the ARSO Geoportal¹⁸.

The water bodies data is in the GeoJSON format¹⁹ and contains:
1. 48 bathing areas.
2. The Slovenian coast.
3. 20 lakes and larger bodies of water.
4. 165 rivers.

Additional data on natural heritage, also in the GeoJSON format, contains:
1. 17 protected areas not included in Natura 2000.
2. 307 areas of ecological importance.
3. 10,730 caves with descriptions added.
4. 357 Natura 2000 areas.
5. 2,657 items from the registry of natural heritage.
6. 517 additional protected areas.

The geographic data on statistical regions and settlements was acquired from the e-Surveying data²⁰ portal of GURS (the Slovenian Geodetic Administration). All the data was obtained in the GeoJSON format and includes the boundaries of 12 statistical regions and the boundaries of 6037 settlements.

5. ROUTING SERVICE
There are several routing services available on the Internet, the most popular being Google Maps²¹. While practical, subscription services are not cost-effective and not in line with the project goals. We therefore looked into the available open source solutions. The most popular one identified was the Open Source Routing Machine [2] (OSRM).

We downloaded the map data²² and combined the Austrian and Slovenian maps into one file using the recommended osmconvert²³ tool. We then processed the data according to the official OSRM instructions²⁴. The authors of OSRM also provide a Docker²⁵ image that can be used for processing the map data and for serving the routing back-end. We used the OSRM Docker image to process and deploy three distinct routing services: for walking, cycling and driving. This enables the service to recommend routes based on the user's preferred way of traveling.

The services are currently available through API calls. For instance, when the service requires a route from point A to point B, it issues an API call to:

http://docker-e9.ijs.si:5007/route/v1/driving/LON-A,LAT-A;LON-B,LAT-B?steps=false

The service returns a JSON response with the most important objects: the route specification in the form of an encoded polyline; the route distance in meters; and the route travel time in seconds. Additional information about the API can be obtained on the official OSRM website²⁶.

6. DATA SERVICES IN AS-IT-IC
The described data and data services will enable the AS-IT-IC services to provide the functionality required by the project.

The attractions database enables: the virtual assistant to search for points-of-interest that best match the user's query, recognize point-of-interest entities, and fetch information about an attraction; the tour planning service to take into account the locations of attractions and provide recommendations to the user based on the attraction category, subcategory, location and similarity to other attractions based on the attraction description; the communication platform to present the data about attractions through the familiar user interface.

The geographical data enables: the virtual assistant to recognize geographical entities and search using geographical position qualifiers (e.g. “cultural heritage in the Ljubljana city”); the tour planning service to make recommendations based on the exact location and on geographical area boundaries.

The routing service enables: the virtual assistant to take into account the tourist's travel options (e.g. “show me natural heritage sites that I can reach in one hour by a bicycle”); the tour planning service to calculate an optimal travel plan (since it takes into account the geographical position of the attractions on the itinerary and the transport option chosen by the tourist) and to provide a preview of the trip on a map.

7. CONCLUSION
In the paper we described the data sources, procurement, structure and types of data made available for the AS-IT-IC platform. Additionally, we provided a short description of the services that the available data makes possible.

The main problem with tourism-related data procurement is the unavailability of data in a structured, easily accessible format. The portal “Odprti Podatki Slovenije”²⁷ (eng. Slovenian Open Data), for instance, is a good start; however, there are still problems with discoverability, data formats and data availability. Several good data sources were identified only after weeks of searching for relevant data over the Internet.

the use in public interest. This is the case with the registry of cultural heritage.

In the future we plan to integrate additional data sources into the AS-IT-IC databases. By performing data-fusion procedures we will merge the data into a single, richer database. Additionally, we will try to provide the data services to third party developers, who now have to go through the same procedure as we did in order to obtain similar data. Open data and services were among the main project goals from the start, since we want to improve the tourist experience not only directly but also indirectly, by providing services that will enable third party developers to come up with their own innovative solutions.

8. ACKNOWLEDGMENTS
The work was co-funded by Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020, project AS-IT-IC.

9. REFERENCES
[1] B. Cvetković, H. Gjoreski, V. Janko, B. Kaluža, A. Gradišek, M. Luštrek, I. Jurinčič, A. Gosar, S. Kerma, and G. Balažič. e-turist: An intelligent personalised trip guide. Informatica, 40(4):447, 2016.
[2] D. Luxen and C. Vetter. Real-time routing with openstreetmap data. In Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’11, pages 513–516, New York, NY, USA, 2011. ACM.

¹⁰ https://www.slovenia.info/en/map
¹¹ http://www.dedi.si/
¹² https://www.europeana.eu/portal/en
¹³ https://podatki.gov.si/
¹⁴ http://www.mk.gov.si/si/storitve/razvidi_evidence_in_registri/register_nepremicne_kulturne_dediscine/
¹⁵ https://www.postgresql.org/
¹⁶ https://www.tripadvisor.com/
¹⁷ http://www.evode.gov.si/sl/vodni-kataster/zbirka-vode/zbirka-podatkov-o-povrsinskih-vodah/hidrografija/
¹⁸ https://gis.arso.gov.si/geoportal/catalog/main/home.page
¹⁹ https://tools.ietf.org/html/rfc7946
²⁰ https://egp.gu.gov.si/egp/
²¹ https://cloud.google.com/maps-platform/
²² http://download.geofabrik.de/europe.html
²³ https://wiki.openstreetmap.org/wiki/Osmconvert
²⁴ https://github.com/Project-OSRM/osrm-backend/wiki/Docker-Recipes
²⁵ https://www.docker.com/
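The routing call described in the ROUTING SERVICE section can be sketched in a few lines of Python. This is a minimal illustration, not the project's client code: the helper names (`build_route_url`, `decode_polyline`, `summarize_route`) are hypothetical, the base URL is simply the one quoted in the text, and the sample response in the usage note is made up. The response fields used (`code`, `routes`, `geometry`, `distance`, `duration`) follow the public OSRM HTTP API, and the geometry is assumed to use the default polyline encoding with precision 5.

```python
import json
from urllib.parse import urlencode

# Base URL quoted in the text; treat it as a placeholder for your own deployment.
OSRM_BASE = "http://docker-e9.ijs.si:5007"


def build_route_url(start, end, profile="driving", base=OSRM_BASE):
    """Build a route request URL; `start` and `end` are (lon, lat) pairs."""
    coords = ";".join(f"{lon},{lat}" for lon, lat in (start, end))
    return f"{base}/route/v1/{profile}/{coords}?{urlencode({'steps': 'false'})}"


def decode_polyline(encoded, precision=5):
    """Decode an encoded polyline string into a list of (lat, lon) pairs."""
    factor = 10 ** precision
    points, index, lat, lon = [], 0, 0, 0
    while index < len(encoded):
        for is_lon in (False, True):
            shift, result = 0, 0
            while True:
                # Each character carries 5 payload bits plus a continuation bit.
                byte = ord(encoded[index]) - 63
                index += 1
                result |= (byte & 0x1F) << shift
                shift += 5
                if byte < 0x20:
                    break
            # The lowest bit stores the sign of the coordinate delta.
            delta = ~(result >> 1) if result & 1 else result >> 1
            if is_lon:
                lon += delta
            else:
                lat += delta
        points.append((lat / factor, lon / factor))
    return points


def summarize_route(response_text):
    """Extract distance (m), duration (s) and the decoded path of the first route."""
    data = json.loads(response_text)
    if data.get("code") != "Ok" or not data.get("routes"):
        raise ValueError("routing service returned no route")
    route = data["routes"][0]
    return {
        "distance_m": route["distance"],
        "duration_s": route["duration"],
        "path": decode_polyline(route["geometry"]),
    }
```

A caller would fetch `build_route_url((lon_a, lat_a), (lon_b, lat_b), profile="driving")` with any HTTP client and pass the response body to `summarize_route`. Note that the coordinates in the URL are given as longitude first, as in the LON-A,LAT-A template above, while the decoded path yields (lat, lon) pairs.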
To obtain relevant information, users with non-commercial intent can still use web scraping, adding the authorship notice and a link to the original data source. However, this leads to fragmentation of data structures, additional stress on Internet bandwidth and non-optimal solutions for keeping the data updated. An additional problem is the unavailability of data for

²⁶ https://github.com/Project-OSRM/osrm-backend/blob/master/docs/http.md
²⁷ https://podatki.gov.si/

Content API - A Cloud-based Data Source for the AS-IT-IC Platform

Oliver A. Tazl
Institute of Software Technology
Graz University of Technology
Inffeldgasse 16b/II, 8010 Graz, Austria
oliver.tazl@ist.tugraz.at

Franz Wotawa
Institute of Software Technology
Graz University of Technology
Inffeldgasse 16b/II, 8010 Graz, Austria
wotawa@ist.tugraz.at

ABSTRACT
This paper introduces the design and implementation of one element of a microservice application that supports a modern tourist information application. We introduce the Content Application Programming Interface (API) system, a microservice that collects tourism-relevant data from multiple sources and in turn provides it to several services within the Austrian-Slovenian Intelligent Tourist Information Center (AS-IT-IC) platform. Content API is built using modern technologies and frameworks, like Docker¹, Spring² or Vaadin³.

Categories and Subject Descriptors
H.2 [Database Management]: Systems; I.2 [Artificial Intelligence]: Distributed Artificial Intelligence - Intelligent agents; K.6 [Management of Computing and Information Systems]: Software Management - Software development

General Terms
Database

Keywords
User interface, data acquisition, REST API

1. INTRODUCTION
The AS-IT-IC platform⁴ was introduced to enable tourists and tourism information providers to interact and collaborate with each other. Several data sources exist from which the AS-IT-IC partners can retrieve information about sights, natural heritage, events and other tourism offers in the program area. The information in these databases is not stored in a unified format, so to integrate it into the AS-IT-IC platform it is necessary to convert the data into a fitting format. The content subsystem, called Content API, integrates this data and provides it to the AS-IT-IC ecosystem. The system collects information from other web sources in order to combine and complete the information for a specific offering.

The core items of these databases are:

1. Sights
2. Tourism companies (e.g. hotels, restaurants, ...)
3. Natural heritage (e.g. rivers, mountains, ...)
4. Events
5. Cultural heritage

The goal of the content subsystem is to integrate various information sources about tourism offers into a single database. This database is meant to be updateable from these sources, like Google or other tourism websites. The integrated user interface also allows human-computer interaction in order to update and review the integrated data. This allows a collaborative approach that enables users to add new pieces of information to our database and also helps to keep the database up to date.

The remainder of the paper is organised as follows: In the next section, we present the chosen design of the system. In Section 3, we get into details of the architecture. Afterwards, Section 4 provides an overview of the implementation and the deployment of the system. Finally, we show some related research and conclude the paper.

2. DESIGN & ARCHITECTURE
To fit the domain-specific data requirements, we devised a data structure that can represent the data needed in our platform, as shown in Figure 2.

The Core Service Layer contains the main functionalities of the web service. The layer hosts the data storage logic and the data acquisition, along with the merger functionalities. These are encapsulated to ensure that new data sources can easily be added. We call these modules within the initial startup sequence as well as on demand.

The possibility to merge data from different sources is crucial for presenting good and up-to-date data. This functionality is contained within the Core Service Layer.

It is also possible to integrate new data sources in order to add new information aspects to the database.

The system is integrated into the architecture of the AS-IT-IC platform as discussed in [1].

Figure 1: Layer of the Content API

2.1 Update Process
The update process collects data from several web data sources and integrates it into the database in an automated manner. Every data source needs a specific update routine that is implemented to integrate the collectable data. These routines compare the recently collected data with the stored data and update the information whenever the external source provides a change. The process can be scheduled automatically or triggered manually.

2.2 User Interface
The web-based user interface (UI) allows the community to contribute information to the platform. It is possible to add new information items, such as sights or tourism offers, to the database. Reviewing and updating existing items is also a very important opportunity for the user community, since it allows a wiki-like contribution model.

In Figure 2, we show the Vaadin-based UI containing several subforms to interact with the different entries from the database.

2.3 REST Interface
The Representational State Transfer (REST) interface can be used to create, read, update and delete (CRUD) the information of the content subsystem. There are endpoints that represent the information in the database and allow it to be accessed via JSON objects, i.e.:

• Location
• Points of interest (POI)
• Equipment
• Transportation

We use these endpoints to access and modify the data in the system via HTTP-JSON calls.

3. IMPLEMENTATION & DEPLOYMENT
We use Docker within the implementation and the deployment step. Docker Compose is a tool for defining and running multi-container Docker applications. In this project, Docker Compose is used on the developer machine to set up a testing environment whose configuration corresponds to its production counterpart.

4. CONCLUSION
In this paper, we presented the Content API, the content microservice and database of the AS-IT-IC platform. Here we focussed on the architecture and design of the system, as well as the implementation and the deployment. We also highlighted the use of the web-based user interface and the REST API as interaction possibilities. Finally, we described the automated deployment using an application container technology as well as continuous integration and deployment tools.

5. ACKNOWLEDGMENTS
Research presented in this paper was carried out as part of the AS-IT-IC project that is co-financed by the Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020, European Union, European Regional Development Fund.

6. REFERENCES
[1] J. Zupancic, G. Grasselli, A. Tavcar, and M. Gams. Virtual Assistants for the Austrian-Slovenian Intelligent Tourist-Information Center. In Proceedings of the 20th International Multiconference INFORMATION SOCIETY - IS 2017, Volume E, 2017.

¹ see docker.com
² see spring.io
³ see vaadin.com
⁴ see as-it-ic.ijs.si
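The update process of the Content API (compare freshly collected data with the stored data and update only what changed) can be illustrated with a short sketch. This is an assumption-laden illustration rather than the Content API implementation, which the paper describes as Spring-based: the function name `merge_source`, the record ids and the field names are all hypothetical, and both data sets are modelled as plain dictionaries keyed by a record id. Fields that a source reports as missing (`None`) are assumed not to overwrite stored values.

```python
def merge_source(stored, collected):
    """Merge freshly collected records into the stored ones.

    Both arguments map a record id to a dict of fields. Returns a new
    merged mapping and the list of ids that were added or changed.
    The inputs are left untouched.
    """
    merged = {key: dict(fields) for key, fields in stored.items()}
    changed = []
    for key, fresh in collected.items():
        current = merged.get(key)
        if current is None:
            # Unknown record: add it as a new item.
            merged[key] = dict(fresh)
            changed.append(key)
            continue
        # Keep only fields that the source actually provides and that differ.
        updates = {
            field: value
            for field, value in fresh.items()
            if value is not None and current.get(field) != value
        }
        if updates:
            current.update(updates)
            changed.append(key)
    return merged, changed


# Hypothetical records, for illustration only.
stored = {
    "poi-1": {"name": "Castle", "hours": "9-17"},
    "poi-2": {"name": "Museum", "hours": "10-18"},
}
collected = {
    "poi-1": {"name": "Castle", "hours": "9-18"},   # changed opening hours
    "poi-3": {"name": "Lake", "hours": None},       # new item
    "poi-2": {"name": "Museum", "hours": None},     # nothing new
}
merged, changed = merge_source(stored, collected)
```

Returning the list of changed ids makes it easy to trigger the routine either on a schedule or manually and to report what each run actually modified, which matches the two trigger modes mentioned in the Update Process section.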
Figure 2: Screenshot of the web-based UI

e-Tourist 2.0: an Adaptation of the e-Tourist for the AS-IT-IC Project

Gregor Grasselli
Jožef Stefan Institute
Jožef Stefan International Postgraduate School
Jamova 39
1000 Ljubljana, Slovenia
gregor.grasselli@ijs.si

ABSTRACT
This article presents a new version of the e-Tourist system called e-Tourist2.0. The e-Tourist2.0 system is developed within the Austrian-Slovenian Intelligent Tourist-Information Center (AS-IT-IC) project, accepted in the Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020 call. The new e-Tourist2.0 system brings a number of additional features with respect to the e-Tourist, such as location aware search, an implicit recommendation engine and a more interactive interface for trip planning. In the paper we briefly explain the need for a new system, and present the architecture and functionality of the e-Tourist2.0.

Keywords
trip planner; AS-IT-IC project; tourism; attractions; tourism database; recommendation engine; location aware search

1. INTRODUCTION
Trip planners are applications that require a user to input a number of predetermined parameters and then respond by offering the user a trip plan that can later be modified. The Austrian-Slovenian Intelligent Tourist-Information Center (AS-IT-IC) platform provides a trip planner with a chat-based user interface. It is built from three major components, each contributing to the final user experience. They are:

1. the front-end provided by a slightly modified version of Rocket.Chat¹
2. e-Tourist2.0 that fulfils the role of the back-end
3. a conversational program (a bot) that takes user inputs, be they free form text input or button clicks, from Rocket.Chat and generates responses, using data acquired from e-Tourist2.0 when necessary

In its function as the back-end for the AS-IT-IC platform, e-Tourist2.0 needs to support an assortment of queries over textual and geographical data (i.e. the ability to provide an answer to questions like “Which 5 points of interest are most similar to the Bled castle based on their descriptions?” or “List all castles in the Gorenjska region”), a recommendation engine that uses implicit data about user interest, as well as trip planning functionality. It also needs to provide the pieces of the user interface that are missing from Rocket.Chat.

Initially we believed that this role could be filled by the existing e-Tourist after some light modifications. However, a more thorough inspection of its capabilities and architecture convinced us that extensive adjustments to the code base are required. Additionally, some of the functionality originally implemented in the application code is now supported by the database itself, which would lead to the deletion of several lines of code. The resulting conclusion was to write a new system using the latest tools and design patterns, while reusing the old code as much as possible.

The rest of this article describes the need for a new system in Section 2, followed by a presentation of the new system in Section 3, where we present the features available through the application programming interface (API) of e-Tourist2.0. In Section 4 we present the system architecture, and Section 5 concludes the paper.

2. WHY A NEW SYSTEM
The required back-end functionality includes:

1. Trip planning
2. Location aware search
3. Recommendation system

Initially e-Tourist was designed to plan trips visiting the coastal region of Slovenia. It was provided with a relatively small list of tourist attractions in both regions and is capable of planning a trip to either of them. The user can specify a number of constraints, like the exact start of their trip and whether they want to eat along the way, and the e-Tourist produces a nice trip recommendation. Based on how well this works, we were convinced that all that needed doing was to add additional data with points for other regions of Slovenia and Austria, spruce up the API so the bot would be able to use the system from outside, and we would have our back end. This was presented in [1].

We are porting some of the trip planning capabilities of e-Tourist to e-Tourist2.0, and in this respect e-Tourist mostly meets our requirements, except for not being RESTful: calls to the API need to carry session-specific state in order to return the correct result. Figure 1 shows what the original user interface for trip planning looks like. It shows the trip on a map and adds controls to add or remove points from it.

Figure 1: e-Tourist trip planning interface

The e-Tourist database stores each point of interest in a tree-based data structure based on its location: a point of interest belongs to (or is a child in a tree-based structure of) a settlement, which in turn belongs to a region. This proves to be a very brittle way of storing larger amounts of points, where some of the settlements have the same names and some areas of interest belong to multiple settlements and even regions. Also, the e-Tourist does not make it possible to ask queries based on actual distance, like “Which restaurants are within 1km of lake Bohinj?”. Therefore, a rewrite of the location aware search aspects of the application would be required.

In terms of the recommendation system, e-Tourist requires users to rate the points of interest on a 1 to 5 scale or to input a lot of data about themselves, like their education, age, gender and so forth. It also requires data about points of interest that would be hard to acquire in an automated way, like which age groups a point of interest is most appealing to or the level of education of the people who most enjoy their visit to the point of interest. Since the implementation of the e-Tourist, several modules have been developed that implement various recommendation algorithms and were released as open source software². For reasons of stability, security and ease of maintenance we opted for one of those instead of using our own recommendation engine implementation.

According to the code analysis we reached the following conclusions:

1. A recommendation system needs to be rewritten to also include implicit information about user interaction with the system.
2. Location aware search needs to be included.
3. Proper database migrations have to be implemented.
4. Database architecture needs to be reworked.
5. The way travel time was computed was very storage and time intensive.
6. e-Tourist is not RESTful.
7. We need a more stateless way of user authentication.
8. Code base has to be updated to the newest framework versions.

Besides the tree-based structure mentioned above, which becomes unnecessary when geographic data is stored correctly, the database schema of e-Tourist also failed to make use of several data structures offered by the PostgreSQL database engine that would make some queries simpler and more performant. The main problem with the database, however, was the lack of an initial migration that would create the relations required in the application. The authors assumed that future developers would use a copy of a pre-created database for that, an approach that made creating a new development environment, as well as deployment, an unnecessarily complicated task.

As it turns out, the e-Tourist does not comply with REST best practices, so it is quite hard to use programmatically, which is essential for us since most of it is going to be used by the Rocket.Chat bot, a program. The latter also means that session cookies are not a viable way for us to check user credentials and a different way of authenticating calls to the API is needed.

Finally, the code used some features of an older version of the framework that were discarded in the newer ones. In order to be able to guarantee long term support for the project we decided that a newer framework version with longer support was required.

Tallying up all of the above, we figured it would take more work to make the necessary changes to the old system than to write a new one from scratch with an eye out for code reuse whenever possible.

3. FEATURES
In addition to the features already mentioned at the start of Section 2, the new e-Tourist2.0 supports:

1. Finding similar points of interest
2. A user interface that allows quickly adding points to a trip plan and deleting them from it by using a map
3. Exporting trips to Google Maps
4. Full text search for points of interest based on their descriptions, and limited by their location
5. A recommendation engine that uses implicit data about user’s interaction with the system to find points that might interest a particular user based on the user’s history

We will continue with a discussion of the features, what they do and why we need them.

3.1 Trip planning
Trip planning means that, given a list of points of interest a potential tourist wants to visit, the system can plan a route on a map that visits all the points listed by the tourist. The new e-Tourist2.0 does this by using the Open Source Routing Machine [2], which uses OpenStreetMap³ data to plan the route. In addition to being able to plan the route, e-Tourist2.0 is also capable of exporting that route to a Google Maps link, so that users may conveniently follow along using the Google Maps app on their devices. The trip planning user interface will also display potentially interesting attractions near the route already chosen so that a tourist may quickly add them to their trip plan.

3.2 Recommendation
Recommendations are a way to present the users with more of the relevant content based on their interests as shown through their history of using the system. This will also allow a registered user to simply ask the system “What is interesting in Koroška?”, as well as provide additional suggestions along the planned path.

Another use of a recommendation system is to compare attractions based on user behaviour. This allows us to find similar attractions to the one picked by the user. Another way to find similar attractions is by comparing their descriptions, and e-Tourist2.0 uses both of them. This feature enables the users to quickly narrow in on what they want to see or to just explore their options more conveniently.

3.3 Full text search
Full text search is a way to quickly search a large database of documents for the ones containing given words or phrases. In e-Tourist2.0 this is coupled with some sentence analysis that attempts to produce more relevant results. It searches through attraction descriptions in German, English and Slovenian.

All attractions and geographical features in e-Tourist2.0 carry complete information about their location. Complete in the sense that regions and settlements are saved as polygons describing their borders, rivers are saved as lines describing their entire flow and so on. This enables all sorts of location based search queries, as well as constraining other queries to certain locations. Examples of such queries are “List all museums in Slovenj Gradec” and “Find all attractions similar to the Bled castle near Klagenfurt”.

The most important feature from the standpoint of the tourist providers is the administration interface, which will allow them to add new points of interest to the system and make corrections to the data on those already there.

4. e-Tourist2.0 ARCHITECTURE
In order to make it easier to port code from e-Tourist to e-Tourist2.0 we chose to implement e-Tourist2.0 in the same framework as e-Tourist, Django⁵. Besides allowing us to more easily port code from the old project, it also comes with a built-in administration interface that made implementing the user interface for tourist information providers a lot easier, since we just had to customise the one provided with the framework.

Our data storage is provided by the PostgreSQL⁶ database. By using the PostGIS⁷ extension we were able to save geographical data and use it for several spatial queries. It also supports full text search. In order to enable support for the Slovenian language we provided some language specific configuration and files, while the German and English languages are supported out-of-the-box.

5. CONCLUSIONS
We have presented a short description of the new e-Tourist2.0, describing the need for a new trip planner implementation, its features and its architecture. While some of the e-Tourist code base was reused, the new e-Tourist2.0 is mostly a new program. Most of the features presented here are already fully functional; however, the program is not yet entirely complete, and changes to existing features or additional features are possible in the future.

6. ACKNOWLEDGMENTS
We thank students Tadej Petrič, Aljaž Glavač and Martin Češnovar, who contributed to e-Tourist2.0 development. The work was co-funded by Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020, project AS-IT-IC.

7. REFERENCES
[1] B. Cvetković, H. Gjoreski, V. Janko, B. Kaluža, A. Gradišek, M. Luštrek, I. Jurinčič, A. Gosar, S. Kerma, and G. Balažič. e-turist: An intelligent personalised trip guide. Informatica, 40(4):447, 2016.
[2] D. Luxen and C. Vetter. Real-time routing with openstreetmap data. In Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’11, pages 513–516, New York, NY, USA, 2011. ACM.

¹ https://rocket.chat/
² https://github.com/benfred/implicit/, http://surpriselib.com/
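The distance-based queries discussed for e-Tourist2.0 (for example, restaurants within 1 km of a given place) can be illustrated without a database. The sketch below is a simplified stand-in, not the e-Tourist2.0 implementation, which the paper says relies on PostgreSQL with the PostGIS extension: it treats every point of interest as a single coordinate pair and uses the haversine great-circle distance. The function names and the sample points are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, center, radius_km):
    """Return the names of points within `radius_km` of `center`, nearest first.

    `points` is a list of dicts with "name", "lat" and "lon" keys;
    `center` is a (lat, lon) pair.
    """
    lat0, lon0 = center
    hits = [
        (haversine_km(lat0, lon0, p["lat"], p["lon"]), p["name"])
        for p in points
    ]
    return [name for distance, name in sorted(hits) if distance <= radius_km]


# Made-up coordinates, for illustration only.
restaurants = [
    {"name": "restaurant_north", "lat": 46.005, "lon": 14.0},
    {"name": "restaurant_far", "lat": 46.02, "lon": 14.0},
    {"name": "restaurant_east", "lat": 46.0, "lon": 14.005},
]
nearby = within_radius(restaurants, center=(46.0, 14.0), radius_km=1.0)
```

A spatial database avoids the linear scan used here by indexing geometries, and it can measure distance to polygons and lines (regions, rivers) rather than to single points, which is what makes queries such as “near Klagenfurt” or “in Slovenj Gradec” practical at scale.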
To request administration access, a tourism provider should fill in a form available on the e-Tourist2.0 website⁴.

³ https://www.openstreetmap.org/
⁴ https://e-turist.docker-e9.ijs.si/tourism-provider/request-admin
⁵ https://www.djangoproject.com/
⁶ https://www.postgresql.org/
⁷ https://postgis.net/

Planning-based Security Testing for Chatbots

Josip Bozic
Institute for Software Technology
Graz University of Technology
A-8010 Graz, Austria
jbozic@ist.tugraz.at

Franz Wotawa
Institute for Software Technology
Graz University of Technology
A-8010 Graz, Austria
wotawa@ist.tugraz.at

ABSTRACT
Chatbots are of increasing importance in modern day communication between users and industrial applications. For example, providers of financial and medical institutions make use of intelligent agents in order to provide accessibility on a 24/7 basis. The human-like communication, often realized in an entertaining way, represents one of the advantages that chatbots offer. Eventually, chatbots make use of artificial intelligence methods in order to learn from past communication interactions, to provide better and more personalized responses. Often chatbots are deployed as part of web applications. As a consequence, this makes them vulnerable to typical security attacks on websites. Planning-based techniques can help to identify security leaks for common attack scenarios in a smart way. In this paper, we present such an approach that relies on artificial intelligence planning for security testing of chatbots that are accessible using web applications.

Categories and Subject Descriptors
C.2 [Computer-Communication Networks]: General - Security and protection; D.2 [Software Engineering]: Testing and Debugging; I.2 [Artificial Intelligence]: Distributed Artificial Intelligence - Intelligent agents

General Terms
Theory, Security

Keywords
Planning, security testing, chatbots

1. INTRODUCTION
ELIZA [21], developed in 1966, is the first known computer program that communicated with a person in a natural language. Over the years, further improvements have been added to similar applications, called chatbots [9, 20]. Chatbots are deployed in a stand-alone or online fashion, i.e., as part of websites in the form of virtual assistants. Such programs offer the advantage of human-like communication as well as almost unlimited accessibility. Deploying chatbots offers financial advantages for service providers as well. They can be used to respond to customers’ inquiries, e.g., to provide information about certain goods or services without the need for human intervention.

In contrast to the initial versions of chatbots, where the programs usually responded to an inquiry according to stored natural language patterns, intelligent agents can be trained to learn from the communication with clients by relying on machine learning. In the long term, these smart chatbots refine their responses and provide better answers over time. Advanced chatbots offer the client the possibility to authenticate herself or himself, thus personalizing the connection between machine and person. Here the chatbot stores potentially sensitive data from the user. In this case, the chatbot must guarantee user authenticity and data integrity. The failure to fulfill such promises might result in personal and financial consequences for the user.

This requirement and the fact that chatbots can be deployed as part of web applications lead to security issues, since web applications are known to be vulnerable to several hacking attacks. Vulnerabilities like SQL injections (SQLI) and cross-site scripting (XSS) are still very common [4], despite security measures and new security testing approaches. Notably, security in the context of chatbots has hardly been considered before.

Automated planning and scheduling, or simply planning, is a branch of artificial intelligence that was initially used in robotics and intelligent agents [17]. There, planning is used for guiding an agent by responding to encountered conditions. Some approaches have applied planning to security testing in specific domains as well [14, 16]. In order to contribute to the mentioned security issues, we introduce an automated planning-based security testing approach for chatbots in this paper.

The paper is organized as follows. Section 2 introduces planning to security testing of chatbots. Then, Section 3 gives an overview of a test execution framework. Finally, Section 4 concludes the work and discusses potential goals for the future.

2. PLANNING FOR CHATBOTS
Planning has already been used in security testing to a small degree. The authors of this paper have applied planning to the testing of web applications (e.g. [7]) and the TLS protocol [6]. However, the application to chatbots is novel in this sense. The main motivation behind using planning for testing is the fact that attacks against applications can be depicted in the form of a sequence of steps that could be applied against every program. Actually, a plan acts as a blueprint for an attack. In this paper, we applied the planning specification from our previous work [5] but use it for identifying vulnerabilities in the case of chatbots.

In general, the planning problem was initially given in [11] and can be defined as follows.

Definition 1. A planning problem is defined by the tuple $(I, G, A)$. A state is defined by a set of first order logic predicates. $I$ represents the initial state, the goal state is $G$ and the set of actions is given by $A$. Every action $a \in A$ comprises a precondition and an effect. The functions $pre(a)$ and $eff(a)$ connect the individual preconditions and effects, respectively.

If the precondition $pre(a)$ of an action $a$ is satisfied in the current state $S$, then this action will be selected for the solution of the planning problem. The execution of this action will lead to a new state $S'$, namely $S \xrightarrow{a} S'$. This procedure will continue until the execution reaches the goal state $G$, i.e. fulfills its preconditions. The program that reads the planning specification and searches for a solution according to a planning algorithm is called a planner.

Definition 2. The solution for the planning problem $(I, G, A)$ is returned in the form of a plan, which is given by a sequence of actions $\langle a_1, \ldots, a_n \rangle$ such that $I \xrightarrow{a_1} S_1 \xrightarrow{a_2} \ldots \xrightarrow{a_{n-1}} S_{n-1} \xrightarrow{a_n} G$.

The planning problem is implemented in the Planning Domain Definition Language (PDDL) [15]. Here, two specifications have to be provided:

• Domain definition: Data that is present for every problem definition.
• Problem definition: Data that defines one specific problem.

PDDL supports a type-object hierarchy of data and uses it in conjunction with first-order logic predicates. Every object corresponds to a specific type, which relates to variables and classes in object-oriented programming, respectively. The individual action definitions are built from parameters and pre- and postconditions, which are defined with one or more predicates. For example, an excerpt from the domain definition for chatbot testing is depicted below.

Domain description in PDDL

    (define (domain chatdomain)
      (:requirements :strips :typing :equality :fluents :adl)
      (:types status address server status-si status-se type
              expect result username password action method
              integer sqli xssi response script)
      (:constants
        init - status
        two - status-si
        sqli xss - type
        get post head - method
        username - username
        password - password)
      (:predicates
        (inInitial ?x)
        (inAddressed ?x)
        (inSentReq ?x)
        (inRecReq ?x)
        (inSQLI ?x)
        (inXSS ?x)
        (inAttackedSQL ?x)
        (inAttackedXSS ?x)
        (inFound ?x)
        (Empty ?url)
        (FoundScript ?script - script ?resp - response))
      (:action Start
        :parameters (?x - status ?url - address)
        :precondition (and (inInitial ?x) (not (Empty ?url)))
        :effect (and (inAddressed ?x) (not (inInitial ?x))))
      (:action SendReq
        :parameters (?x - status ?se - status-se ?si - status-si)
        :precondition (inAddressed ?x)
        :effect (and (inSentReq ?x) (not (inAddressed ?x))
                     (assign (sent ?se) 1)
                     (statusinit two)))
      (:action ParseRespXSSCheck
        :parameters (?x - status ?script - script ?resp - response)
        :precondition (and (inRecRespRXSS ?x)
                           (not (FoundScript ?script ?resp)))
        :effect (and (FoundScript ?script ?resp) (inFound ?x)
                     (not (inRecRespRXSS ?x)))))

As can be seen, the PDDL definition encompasses, among others, types, predicates and actions. Again, the individual action definitions make use of the predicates and apply parameters in order to check if the predicate is valid. The specification uses the parameter x to denote the current state of execution. As mentioned, the above domain, due to space reasons, does not include our entire specification. On the other hand, the problem definition is defined as follows.

order to test for vulnerabilities.
For example, model-based ( d e f i n e ( p r o b l e m c h a t p r o b l e m ) approaches usually rely on a model of the SUT [19, 18], ( : d o m a i n c h a t d o m a i n ) whereas fuzzing and combinatorial testing put emphasis on ( : o b j e c t s test case generation from a pentesting aspect [10, 13]. x - s t a t u s s - s e r v e r SQLI and XSS represent two common vulnerabilities for url - a d d r e s s many years and need further addressing for this reason. De- m - m e t h o d tailed information about these two vulnerabilities can be exp - e x p e c t found in [8] and [12], respectively. Chatbots, as already s c r i p t - s c r i p t mentioned, when deployed as part of a web application, in- re sp - r e s p o n s e ) herit the vulnerabilities as well. A scenario that depicts the ( : i nit entire planning-based security testing system is depicted in ( i n I n i t i a l x ) Figure 1. ( not ( E m p t y url ) ) ( M e t h o d pos t ) Attack vectors are malicious input strings that an attacker ( R e s p o n s e res p ) or tester submits against an application. For XSS, the list of ( not ( F o u n d S c r i p t s c r i p t res p ) ) attack vectors consists of JavaScript code, whereas malicious ( : g oal SQL statements are used for SQLI. As already mentioned in ( i n F i n a l x ) ) ) Section 2, a generated plan is used as an abstract test case. The reason for this is the fact that PDDL is limited with Problem description in PDDL regard to setting of concrete values for parameters. For this reason, we define a test execution framework, that encom- passes, among others, an executioner. This framework is The problem definition comprehends the definition of ob- implemented in Java and contains concrete Java methods jects and, most important, the initial state. This state rep- that correspond to the individual actions from PDDL. More resents the starting point from which the planner will start information about this mechanism can be found in [5]. the search. 
Modification of the initial state will result in the generation of a different plan. If no plan can be generated, The executioner reads the abstract plan and searches for the then the planner returns an error. A generated plan looks concrete counterpart of the individual actions. Then, HTTP as follows: requests are created with the help of HttpClient [1] and in- 0: S T A R T X URL stantiated with an attack vector. Then, the attack is carried 1: S E N D R E Q X SE SI out in form of the HTTP request. The SUT is a deployed 2: R E C R E Q X SI chatbot that encompasses a database. The chatbot has a 3: P A R S E X M U S E R N A M E P A S S W O R D T YPE user input field, e.g. an HTML element for textual inputs, 4: C H O O S E X S S X TY PE that represents the target for the executioner. The test or- 5: A T T A C K X S S X XS SI M UN PW acles specify what test output is expected and provide the 6: P A R S E R E S P X S S X S C R I P T RE SP final test verdict. We rely on our previously implemented 7: P A R S E R E S P X S S C H E C K X S C R I P T RE SP oracles from [7] for this purpose. After an attack, a parser 8: F I N I S H X reads the response from the SUT. It searches the HTML structure for critical vulnerability indicators, as specified in the oracles. In this scenario, we rely on jsoup [2]. The test- Generated plan for XSS ing process continues as long as the plan has been executed for every attack vector. As mentioned before, the plan is represented by a sequence of actions and corresponding parameters picked from the do- 4. CONCLUSION main definition. In our case, we used the planner Metric-FF In this paper, we introduced a security testing approach for [3]. Now, this sequence of steps acts as an abstract test case chatbots that relies on planning. After manually defining that will be executed by an executioner against the system the specification and generating the plan, a test execution under test (SUT). 
The purpose of the plan is to guide the implementation executes the plan in an automated manner. test execution process that, in the best case, will lead to the The approach is meant for testing of chatbots against two detection of a vulnerability. The main idea here is to apply common web vulnerabilities, namely XSS and SQLI. Under this plan against every chatbot that corresponds to the sce- the assumption that chatbots will play a major role in the nario as described in the next section. future, it remains important to address this issue. In the future, the proposed security testing approach will be evalu- ated against real-world applications and compared to other testing techniques. 3. SECURITY TESTING OF CHATBOTS Security plays a major role for every software system. Fail- ure to fulfill security requirements might lead to severe pri- Acknowledgments vate, financial and reputation consequences. For this rea- The research presented in the paper has been funded in son, programs have to be tested during the development part by the Cooperation Programme Interreg V-A Slovenia- lifecycle and after release of the software. Until now, many Austria under the project AS-IT-IC (Austrian-Slovene In- manual and automated approaches have been introduced in telligent Tourist Information Center). 25 Figure 1: Planning-based Chatbot Security Testing 5. REFERENCES Problem Solving. In Artificial Intelligence, pages [1] Apache HttpComponents - HttpClient. https: 189–208, 1971. //hc.apache.org/httpcomponents-client-ga/. [12] S. Fogie, J. Grossman, R. Hansen, A. Rager, and P. D. Accessed: 2018-09-06. Petkov. XSS Attacks: Cross Site Scripting Exploits [2] jsoup: Java HTML Parser. https://jsoup.org/. and Defense. Syngress, 2007. Accessed: 2018-02-02. [13] D. Kuhn, D. Wallace, and A. Gallo. Software Fault [3] Metric-FF. https://fai.cs.uni-saarland.de/ Interactions and Implications for Software Testing. In hoffmann/metric-ff.html. Accessed: 2018-09-06. 
IEEE Transactions on Software Engineering 30 (6), [4] OWASP Top Ten Project. 2004. https://www.owasp.org/index.php/Category: [14] A. Leitner and R. Bloem. Automatic Testing through OWASP_Top_Ten_Project. Accessed: 2018-01-31. Planning. Technical report, Technische Universität [5] J. Bozic and F. Wotawa. Plan It! Automated Security Graz, Institute for Software Technology, 2005. Testing Based on Planning. In Proceedings of the 26th [15] D. McDermott, M. Ghallab, A. Howe, C. Knoblock, IFIP International Conference on Testing Software A. Ram, M. Veloso, D. Weld, and D. Wilkins. PDDL - and Systems (ICTSS’14), pages 48–62, September The Planning Domain Definition Language. In The 2014. AIPS-98 Planning Competition Comitee, 1998. [6] J. Bozic and F. Wotawa. Planning the Attack! Or [16] A. M. Memon, M. E. Pollack, and M. L. Soffa. A How to use AI in Security Testing? In Proceedings of Planning-based Approach to GUI Testing. In First International Workshop on AI in Security Proceedings of the 13th International Software / (IWAIse), 2017. Internet Quality Week (QW’00), 2000. [7] J. Bozic and F. Wotawa. Planning-based Testing of [17] S. J. Russell and P. Norvig. Artificial Intelligence: A Web Applications. In Proceedings of the 13th Modern Approach. Prentic Hall, 1995. International Workshop on Automation of Software [18] I. Schieferdecker, J. Grossmann, and M. Schneider. Test (AST’18), 2018. Model-Based Security Testing. In Proceedings of the [8] J. Clarke, R. M. Alvarez, D. Hartley, J. Hemler, Model-Based Testing Workshop at ETAPS 2012. A. Kornbrust, H. Meer, G. OLeary-Steele, A. Revelli, EPTCS, pages 1–12, 2012. M. Slaviero, and D. Stuttard. SQL Injection Attacks [19] M. Utting and B. Legeard. Practical Model-Based and Defense. Syngress, Syngress Publishing, Inc. Testing - A Tools Approach. Morgan Kaufmann Elsevier, Inc., 30 Corporate Drive Burlington, MA Publishers Inc., 2006. 01803, 2009. [20] R. S. Wallace. The Anatomy of A.L.I.C.E. In ALICE [9] K. Colby. 
Artificial Paranoia: A Computer Simulation A.I. Foundation, 2004. of Paranoid Process. In Pergamon Press, New York, [21] J. Weizenbaum. ELIZA–A Computer Program For the 1975. Study of Natural Language Communication Between [10] F. Duchene, S. Rawat, J.-L. Richier, and R. Groz. Man and Machine. In Communications of the ACM KameleonFuzz: Evolutionary Fuzzing for Black-Box Volume 9, Number 1 (January 1966), 1966. XSS Detection. In CODASPY, pages 37–48. ACM, 2014. [11] R. E. Fikes and N. J. Nilsson. STRIPS: A New Approach to the Application of Theorem Proving to 26 Indeks avtorjev / Author index Bozic Josip ................................................................................................................................................................................... 23 Gams Matjaž .................................................................................................................................................................................. 5 Grasselli Gregor ............................................................................................................................................................... 10, 14, 20 Mahnič Blaž ................................................................................................................................................................................. 14 Tazl Oliver August ................................................................................................................................................................. 14, 18 Wotawa Franz ........................................................................................................................................................................ 18, 23 Zupančič Jernej .................................................................................................................................................................. 
5, 10, 14 27 28 Konferenca / Conference Uredila / Edited by Delavnica AS-IT-IC / AS-IT-IC Workshop Matjaž Gams, Jernej Zupančič Document Outline 01 - Naslovnica-sprednja-E 02 - Naslovnica - notranja - E 03 - Kolofon - E 04 - 05 - IS2018 - Skupni del 07 - Kazalo - E 08 - Naslovnica podkonference - E 09 - Predgovor podkonference - E 10 - Programski odbor podkonference - E 11 - Clanki - E 01_Zupancic-Gams_AS-IT-IC-report 02_Grasselli-Zupancic_Tourism-tools 03_Zupancic-Tazl-Mahnic-Grasselli_AS-IT-IC-databases.pdf 04_Tazl-Wotawa_Content-API.pdf 05_Grasselli_e-Tourist2 Introduction Why a New System Features Trip planning Recommendation Full text search e-Tourist2.0 architecture Conclusions Acknowledgments References 06_Bozic-Wotawa_Security-testing-chatbots 12 - Index - E 13 - Naslovnica-zadnja-E Blank Page Blank Page Blank Page Blank Page Blank Page 04 - 05 - IS2018 - Predgovor in odbori.pdf 04 - IS2018 - Predgovor 05 - IS2018 - Konferencni odbori
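The search procedure behind Definitions 1 and 2 in Section 2 can be illustrated with a minimal sketch. It is not the algorithm used by Metric-FF: it assumes a plain breadth-first forward search over propositional facts, and the preconditions and effects given to each action are hypothetical simplifications of the PDDL excerpt (parameters, types and fluents are omitted). The function name `plan` and the helper `act` are likewise illustrative.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # facts that must hold before the action (pre(a))
    add: FrozenSet[str]     # facts the action makes true (part of eff(a))
    delete: FrozenSet[str]  # facts the action makes false (part of eff(a))


def act(name, pre, add, delete):
    """Convenience constructor that freezes the fact sets."""
    return Action(name, frozenset(pre), frozenset(add), frozenset(delete))


def plan(initial: Iterable[str], goal: Iterable[str],
         actions: List[Action]) -> Optional[List[str]]:
    """Breadth-first forward search; returns action names or None if no plan exists."""
    start, goal_facts = frozenset(initial), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal_facts <= state:           # goal conditions fulfilled
            return steps
        for a in actions:
            if a.pre <= state:            # precondition satisfied in state S
                nxt = (state - a.delete) | a.add   # S --a--> S'
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [a.name]))
    return None


# Hypothetical, heavily simplified action set loosely following the XSS plan.
ACTIONS = [
    act("START", {"inInitial"}, {"inAddressed"}, {"inInitial"}),
    act("SENDREQ", {"inAddressed"}, {"inSentReq"}, {"inAddressed"}),
    act("ATTACKXSS", {"inSentReq"}, {"inAttackedXSS"}, {"inSentReq"}),
    act("PARSERESPXSSCHECK", {"inAttackedXSS"}, {"inFound"}, {"inAttackedXSS"}),
    act("FINISH", {"inFound"}, {"inFinal"}, {"inFound"}),
]

xss_plan = plan({"inInitial"}, {"inFinal"}, ACTIONS)
```

In this sketch an unsolvable problem yields `None`, whereas the paper states that the planner reports an error in that case. Changing the initial facts changes the resulting sequence, which mirrors the remark that a modified initial state leads to a different plan.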
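The test execution loop described in Section 3 (an executioner that maps abstract plan steps to concrete methods, submits each attack vector and lets an oracle inspect the response) might look roughly like the following sketch. It is only an approximation under stated assumptions: the authors' framework is written in Java with HttpClient and jsoup, while this sketch uses Python's standard `html.parser` and replaces the HTTP request with a plain function call. The functions `vulnerable_bot` and `escaping_bot` are hypothetical stand-ins for a deployed chatbot, only two of the plan's actions have handlers, and the oracle (a reflected `<script>` body) is a simplification of the oracles from [7].

```python
import html
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Collects the text content of every <script> element in a document."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.scripts[-1] += data


def script_bodies(html_text):
    finder = ScriptFinder()
    finder.feed(html_text)
    finder.close()
    return finder.scripts


def attack_xss(ctx, send):
    # Concrete counterpart of ATTACKXSS: submit the attack vector to the SUT.
    ctx["response"] = send(ctx["vector"])


def parse_resp_xss_check(ctx, send):
    # Simplified oracle: the injected script body reappears as a script element.
    injected = script_bodies(ctx["vector"])
    reflected = script_bodies(ctx["response"] or "")
    ctx["found"] = any(body in reflected for body in injected)


# Only two actions are implemented here; other plan steps are skipped.
HANDLERS = {"ATTACKXSS": attack_xss, "PARSERESPXSSCHECK": parse_resp_xss_check}

PLAN = ["START X URL", "SENDREQ X SE SI", "RECREQ X SI",
        "PARSE X M USERNAME PASSWORD TYPE", "CHOOSEXSS X TYPE",
        "ATTACKXSS X XSSI M UN PW", "PARSERESPXSS X SCRIPT RESP",
        "PARSERESPXSSCHECK X SCRIPT RESP", "FINISH X"]

VECTORS = ["<script>alert(1)</script>", "<script>alert('xss')</script>"]


def run_plan(plan_steps, attack_vectors, send):
    """Executes the plan once per attack vector and returns a verdict per vector."""
    verdicts = {}
    for vector in attack_vectors:
        ctx = {"vector": vector, "response": None, "found": False}
        for step in plan_steps:
            handler = HANDLERS.get(step.split()[0])
            if handler is not None:
                handler(ctx, send)
        verdicts[vector] = ctx["found"]
    return verdicts


def vulnerable_bot(message):
    # Hypothetical SUT that echoes user input without encoding it.
    return "<html><body><p>You said: " + message + "</p></body></html>"


def escaping_bot(message):
    # Hypothetical SUT that HTML-escapes user input before echoing it.
    return "<html><body><p>You said: " + html.escape(message) + "</p></body></html>"
```

Calling `run_plan(PLAN, VECTORS, vulnerable_bot)` flags every vector, while `run_plan(PLAN, VECTORS, escaping_bot)` flags none. Keeping the plan abstract and binding the attack vector only at execution time reflects the paper's point that PDDL is limited with regard to concrete parameter values.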