Proceedings of the 8th Student Computing Research Symposium (SCORES’22) Ljubljana, Slovenia October 6, 2022 Uroš Čibej Luka Fürst Lovro Šubelj Jure Žabkar (Eds.) https://www.scores.si Proceedings of the 8th Student Computing Research Symposium (SCORES’22) Ljubljana, Slovenia October 6, 2022 Uroš Čibej Luka Fürst Lovro Šubelj Jure Žabkar (Eds.) Kataložni zapis o publikaciji (CIP) pripravili v Narodni in univerzitetni knjižnici v Ljubljani COBISS.SI-ID=126673155 ISBN 978-961-7059-11-3 (Fakulteta za računalništvo in informatiko, PDF) Digitalna izdaja je prosto dostopna This digital publication is freely available http://zalozba.fri.uni-lj.si/SCORES2022.pdf DOI: https://doi.org/10.51939/scores22 Založnik: Založba UL FRI, Ljubljana Izdajatelj: UL Fakulteta za računalništvo in informatiko, Ljubljana Urednik: prof. dr. Franc Solina Copyright © 2022 Založba UL FRI. All rights reserved. Title Proceedings of the 8th Student Computing Research Symposium (SCORES’22) Editors Uroš Čibej (University of Ljubljana, Faculty of Computer and Information Science) Luka Fürst (University of Ljubljana, Faculty of Computer and Information Science) Lovro Šubelj (University of Ljubljana, Faculty of Computer and Information Science) Jure Žabkar (University of Ljubljana, Faculty of Computer and Information Science) Conference 8th Student Computing Research Symposium (SCORES’22) Venue University of Ljubljana Faculty of Computer and Information Science Večna pot 113, SI-1000 Ljubljana, Slovenia Date October 6, 2022 Program Committee Klemen Berkovič (University of Maribor) Zoran Bosnić (University of Ljubljana) Janez Brest (University of Maribor) Lucija Brezočnik (University of Maribor) Andrej Brodnik (University of Ljubljana) Patricio Bulić (University of Ljubljana) Jani Dugonik (University of Maribor) Iztok Fister (University of Maribor) Mario Gorenjak (University of Maribor) Vida Groznik (University of Ljubljana) Branko Kavšek (University of Primorska) Štefan Kohek (University of Maribor) Matjaž Krnc (University of Primorska) Niko Lukač (University of Maribor) Uroš Mlakar (University of Maribor) Jan Popič (University of Maribor) Peter Rogelj (University of Primorska) Aleksander Sadikov (University of Ljubljana) Domen Šoberl (University of Primorska) Damjan Vavpotič (University of Ljubljana) Grega Vrbančič (University of Maribor) Slavko Žitnik (University of Ljubljana) Organizing Committee Uroš Čibej (University of Ljubljana) Luka Fürst (University of Ljubljana) Lovro Šubelj (University of Ljubljana) Jure Žabkar (University of Ljubljana) Published by University of Ljubljana Faculty of Computer and Information Science Večna pot 113, SI-1000 Ljubljana, Slovenia https://www.fri.uni-lj.si/en, dekanat@fri.uni-lj.si Co-published by University of Maribor Faculty of Electrical Engineering and Computer Science Koroška cesta 46, SI-2000 Maribor, Slovenia https://feri.um.si/en/, feri@um.si Co-published by University of Primorska Faculty of Mathematics, Natural Sciences and Information Technologies Glagoljaška ulica 8, SI-6000 Koper, Slovenia https://www.famnit.upr.si/en, info@famnit.upr.si Edition 1st Publication type E-book Published Ljubljana, Slovenia, October 2022 © University of Ljubljana, Faculty of Computer and Information Science This book is published under a Creative Commons 4.0 International licence (CC BY 4.0). This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. Any third-party material in this book is published under the book’s Creative Commons licence unless indicated otherwise in the credit line to the material. If you would like to reuse any third-party material not covered by the book’s Creative Commons licence, you will need to obtain permission directly from the copyright holder. https://creativecommons.org/licenses/by/4.0/ Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Editors’ Foreword In computer science, there is certainly no lack of scien- paper had been examined by at least two reviewers. The au-ti�c gatherings. Even in a comparatively small country such thors come from a diverse set of academic and non-academic as Slovenia, multiple conferences are held annually. How- institutions, namely University of Ljubljana, University of ever, a vast majority of those events are aimed at a general Maribor, University of Primorska, Hongik University (Re-scienti�c community, which predominantly consists of pro- public of Korea), Rhodes College (USA), University Medical fessional researchers and PhD students. Students at the BsC Center Ljubljana, and the Ministry of the Interior of the Re-or MsC level seldom engage in potentially publishable re- public of Slovenia. The Best Paper Award was given to Ben-search, and even those who do rarely decide to put their jamin Džubur, Žiga Trojer, and Urša Zrimšek, for their paper work in print, be it fear of rejection or unfamiliarity with Semantic analysis of Russo-Ukrainian war tweet networks. Ina publication venues. However, if they were o�ered an oppor- Bašić, Eric Gottlieb, and Matjaž Krnc, the authors of the pa-tunity to present their work at a conference that took into per Some observations on the column-row game, received the account their relative lack of experience but nevertheless up- Best Presentation Award. Congratulations! held well-established criteria of acceptance (scienti�c nov-The papers that we received via the submission system, elty, referencing prior work, quality of presentation, etc.), the reviews prepared by the members of our Program Com-they would be more likely to take up research during their mittee, and the presentations given at the conference itself BsC and MsC years, to publish the results of their work, not only dispelled any concerns regarding quality but actu-and, ultimately, to pursue a scienti�c career. Furthermore, ally thoroughly exceeded our expectations. Having attended a student-centered conference might entice the ‘uninitiated’ many conferences in our local environment and abroad, we audience to follow the suit. Needless to say, bringing stu- dare say that quite a few of them were rivaled, if not sur-dents with di�erent interests and from di�erent backgrounds passed, by what we saw at SCORES’22. Nevertheless, we are together would create a melting pot for all sorts of ideas, in- certain that there is still room for improvement, and we have cluding those that can be potentially developed into future no doubts that SCORES’23 will push the envelope even fur-research papers. ther. Formerly known as StuCoSReC (Student Computer Sci- ence Research Conference), SCORES is one of the few con- We are immensely grateful to our reviewers, the mem- ferences explicitly targeting the BsC and MsC student pop- bers of the Program Committee; to David Nabergoj and Vit-ulation. Even more, each paper must be authored by at least jan Zavrtanik, our PhD students who gave engaging talks one BsC or MsC student. Ever since its inception in 2014, the on two cutting-edge research topics; to Prof. Franc Solina, conference has been jointly organized by the computer sci- who took care of everything that was necessary to have these ence departments of the University of Ljubljana, the Univer- proceedings published; to the people from Communication sity of Maribor, and the University of Primorska, although it O�ce and Computer Center at our Faculty, who were ex-is customary that every year one of them is selected as the tremely helpful throughout the process; to Iztok Fister from primary organizer. In the year 2022, this role was played by the University of Maribor, who told us all we had to know the University of Ljubljana (Faculty of Computer and Infor- to get things started; and, last but not least, to Dewesoft and mation Science). Wisdom Labs, Slovenian companies that generously funded At SCORES’22, we accepted 12 papers and arranged them not only the prizes awarded at SCORES’22 but ... quite pos-into three sessions of four papers each: Arti�cal Intelligence sibly those to be given at SCORES’23 and SCORES’24, too. and Machine Learning, Algorithmics and Theoretical Com- We sincerely apologize to all those whom we forgot to puter Science, and Applications of Computer Science. Each mention. It was certainly not done on purpose. Uroš Čibej Luka Fürst Lovro Šubelj Jure Žabkar v Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 vi Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Conference Program Arti�cial Intelligence, Machine Learning, and Pattern Recognition 1 1 Use of network features to improve 3D object classi�cation Gal Petkovšek, Janez Božič, and Marija Marolt 5 E�cient machine learning based graph layout calculation for investigation platform Alen Granda, Niko Lukač, Aleksander Pur, and Štefan Kohek 9 Music genre classi�cation based on spectrograms of the sound recording using an ensemble of CNNs Tadej Lahovnik and Vili Podgorelec 13 Super-resolution method for reconstructing street images from surveillance system based on Real-ESRGAN Quoc Toan Nguyen Algorithmics and Theoretical Computer Science 17 17 An analysis of time complexity and a discussion on interpretability for two methods of constructing social network graphs Žan Jonke 21 Empirical evaluation of sequential, parallel and distributed implementations of k-means clustering Andrej Perković and Aleksandar Tošić 25 Exact exponential algorithms for the center closing problem Samo Metličar and Jurij Mihelič 29 Some observations on the column-row game Ina Bašić, Eric Gottlieb, and Matjaž Krnc Applications of Computer Science 33 33 Semantic analysis of Russo-Ukrainian war tweet networks Benjamin Džubur, Žiga Trojer, Urša Zrimšek 37 Performance evaluation of the SloBERTa and XML-RoBERTa transformer models on the Slovenian news corpus SentiNews 1.0 Martin Domajnko and Jakob Kordež 41 Spletna aplikacija za analizo tapkanja s prsti Filip Zupančič, Gal Žagar, Dejan Georgiev, and Jure Žabkar 45 Iterated prisoner’s dilemma and survival of the �ttest from an ecological perspective Martin Domajnko Index of Authors 49 vii Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 viii Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Use of network features to improve 3D object classification Gal Petkovšek Janez Božič Marija Marolt gp19194@student.uni-lj.si jb1236@student.uni-lj.si mm7522@student.uni-lj.si Faculty of computer science, Faculty of computer science, Faculty of computer science, University of Ljubljana Večna pot 113 University of Ljubljana Večna pot 113 University of Ljubljana Večna pot 113 SI-1000 Ljubljana, Slovenia SI-1000 Ljubljana, Slovenia SI-1000 Ljubljana, Slovenia ABSTRACT In our work we tackle a problem of 3D object classi�cation, which is a task traditionally closer to computer graphics rather than net- work analysis. There are several di�erent approaches for solving this problem including deep learning, topological data analysis, graph theory etc. We use the Mapper algorithm that transforms the point cloud into a graph, which simpli�es the data while obtaining the key properties of the structure. This algorithm is already used to solve such a task however, the features extracted from the graph Figure 1: Chair point cloud and its network made with Map- were very limited. The novelty that we introduce is the calculation per algorithm of some network properties on such graphs which are used for classi�cation. The results show that the models have better classi- �cation accuracy scores when using network analysis attributes represent the object in the simplest way possible, while still keeping in addition to attributes from topological analysis however, it is the essential information required for the classi�cation. From this challenging to determine exactly which features will perform well relatively simple representation of an object we can compute some for a classi�cation of objects. topological properties of the structure or some properties from network analysis that would be useful for classi�cation. The options KEYWORDS for attributes are quite extensive and in this work we limit ourselves to three topology features and 14 network analysis features. We then network analysis, 3D object classi�cation, Mapper algorithm, fea- test di�erent subsets of features to determine which combination ture selection is most e�ective for classi�cation. There are a lot of di�erent approaches of either describing the 1 INTRODUCTION data or doing the actual classi�cation of 3D objects. A paper from 2016 [8] describes an approach that takes all the data points and 3D object recognition is one of the most challenging �elds in object feeds them to a neural network. Another approach [11] consider the recognition community. The goal of the task is for a model to data as a set of 2D images. Another research [4] uses a technique di�erentiate between di�erent 3D objects that are presented to it called topology matching on graphs. Some authors [3, 6, 11, 13] in a certain format. also describe the use of Convolutional Neural Networks (CNNs) for In 3D computer graphics a polygon mesh is a collection of ver- solving such problems. Instead of CNN we will be using Support tices, edges and faces that de�nes a polyhedral object [1] (however vector machine (SVM), as in the article [5] Anupama in this work we consider only vertices positioned in a 3D space - et. al. The Mapper algorithm that we are using to transform the data was �rst from now on this will be referred to as point cloud). Because the described in the article by Singh process of scanning objects and saving them in such format has et. al [10] however, for classi�cation they limited themselves to the outputs of the Mapper algorithm become very common, a need has developed to be able to automat- while we explore the network further. ically classify and visualise such data. An example can be seen on Figure 1. Given points together can human easily discern silhou- 2 METHODS ette as a chair. It takes a computer quite some time to process all the points and connect them intuitively, so we need to extract the 2.1 Problem de�nition necessary data from the point cloud, train a model, and then make We have tested our approach on a dataset that consists of four a prediction. classes (table, chairs, octopuses and spiders). We tackle this problem The �rst and one of the most fundamental challenges of this as a binary classi�cation (deducing whether a sample is e.g. a table task is choosing the best representation of such data to be inputted or not). More precisely we compare four di�erent classi�cation into the models for classi�cation. One approach is to transform the tasks (one binary classi�er for every one of the four classes) in data into a set of 2D images and feed them into the model [11] and order to determine how the use of network analysis features a�ect another would be to just use the whole set of points [8]. In this work the performance of models on di�erent target classes. we used a Mapper algorithm [10] to transform the set of points For every one of the four models we test every possible subset into a compact network. An example of such transformation can be of features that are described in Section 2.4. seen on Figure 1, where we can on the left side observe the original New models (with network theory features) are compared to point cloud and the extracted network on the right. The idea is to models with one homology feature. h�ps://doi.org/10.51939/scores22.01 1 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Gal Petkovšek, Janez Božič, and Marija Marolt 2.2 Data connections between the points. We measure when connected com- The data is obtained from McGill 3D Shape Benchmark. [14] It is a ponents appear and when they merge, representing their lifetime. repository of point clouds for testing 3D shape retrieval algorithms. In our case we use persistence entropy (entropy of the points in We use the mesh format where each object’s surface is provided in a persistence diagram later referred to as h1), and amplitudes of triangulated form. To provide versatility but also enough models in persistent diagrams, which plot lifetimes of connected components, every class, one class is containing goal object models and second computed with Wasserstein or Bottleneck distance as a classi�- class containing three other objects models. There are point clouds cation parameters (later referred to as h2 and h3). From network of 21 tables, 23 chairs, 25 spiders and 31 octopuses taken from theory we included the following features: McGill 3D Shape Benchmark repository (together 100 samples). Examples of the referred samples can be seen on Figure 2. (0) Number of articulation points (NA) - nodes that sepa- rate the notwork into multiple sub-networks. (1) Average degree of node in a network (h:i) (2) Network density (d) (3) Average clustering coe�cient (h⇠i) (4) Normalised number of nodes in top 10% closeness centrality (number of nodes having 90% or more of maximum node closeness centrality divided by number Figure 2: Examples of four types of objects in the database of nodes) (C>?10_l 1) (5) Normalised number of nodes in top 10% betweenness centrality (number of nodes having 90% or more of 2.3 Network construction maximum node betweeness centrality divided by num- ber of nodes) (C>?10_f) To create a network from data point cloud we use the Mapper (6) Normalised number of cliques with 4 nodes (number algorithm. It is composed of three main steps: the �ltering function, of cliques containing 4 nodes divided by number of the cover function and the clustering function [12]. After some nodes) (NClique) initial testing we chose the best working components. Example of (7) Degree assortativity coe�cient (DAC) the transformation is shown on �gure 1. (8) Normalised number of leaves (number of nodes with The main purpose of the �ltering function is to reduce the di- degree 1 divided by number of nodes) (Nl) mensionality of the data. In our case this function projects data (9) Maximum node closeness centrality ("0Gl 1) points de�ned in a 3D space onto the xy plane. (10) Maximum node betweenness centrality ("0Gf) The next step is to calculate groups of overlapping points with (11) Average node closeness centrality ( E6l 1) the help of the covering function. This function takes as input (12) Average node betweenness centrality ( E6f) the �ltered space, calculates overlapping groups with dividing and (13) Number of Louvain Communities (NLC) transform points from each group back to the original space. Over- lapped groups are computed with division of each input dimension by 3 equal-length intervals with fractional overlap of 0.15. Attributes such as number of articulation points, number of Finally, we apply a clustering algorithm on each group and calcu- leafs or number of louvain communities tell us a lot about the ar- late an undirected graph. This is done with a clustering algorithm ticulation parts of the structure which could be a very valuable that is able to determine the number of clusters in a dataset auto- information for the model. Average degree of a node and network matically or by manually de�ning it. For this purpose we use the density provide information on how many edges formed in the net- DBSCAN [7] algorithm where the maximum distance between two work which in other words is how many of clusters obtained from points for one to be considered as in the neighborhood of the other the mapper algorithm described in Section 2.3 share some points. is 10 using Chebyshev distance. Each cluster generates a node and Features including node centralities contain information about the the edges between them are created if the two clusters share some importance of speci�c nodes (either on average, maximum or just points. what proportion of nodes achieves the higher centralities). The idea The resulting network contains a lot less nodes than there are behind number of cliques and average clustering coe�cient is that points in the point cloud and is therefore more easily handled than they could indicate if an object would contain some larger even the original data representation. An example of such network can surface that would be more fully connected. be seen in Figure 1. Our goal is to determine the best combination of features, how- ever since we only have a very limited number of samples, we 2.4 Attribute calculation do not consider too large subsets of features (on which the model A novel practice in point cloud object recognition is applying dif- would only over�t and produce meaningless results). Therefore we ferent homology based attributes to the output of the mapper. A restrict the number of features to either 4 (if we also include some “continuous” shape is built on the top of the data in order to high- homology) or 6 otherwise. light the underlying topology or geometry. By using homology, we We extract most of the measures from Python library NetworkX [9], observe numbers of components, holes, and voids while adding and calculate the remaining with custom code using data from the 2 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Use of network features to improve 3D object classification library. Speci�cally average network clustering, number of articu- tables chairs octopus spiders Average lation points, assortativity coe�cient, number of louvain commu- h1 0.62 0.91 0.66 0.71 0.73 nities, betweenes and closeness centralities were calculated with h2 0.48 0.80 0.48 0.77 0.63 NetworkX functions. h3 0.84 0.60 0.62 0.81 0.72 NA 0.50 0.48 0.50 0.48 0.49 2.5 Learning h:i 0.86 0.43 0.45 0.79 0.63 For modeling we use Support Vector Machines (SVM) implementa- d 0.43 0.37 0.56 0.41 0.44 tion from SciKit Learn [2] library for Python with a linear kernel h⇠i 0.24 0.36 0.27 0.33 0.30 setting. Other hyperparameters of the model are set to their default C>?10_l 1 0.59 0.44 0.56 0.30 0.47 values as speci�ed in the SciKit Learn [2] documentation. C>?10_f 0.74 0.77 0.71 0.37 0.65 We evaluate our models with 10-fold cross validation to improve NClique 0.35 0.42 0.46 0.79 0.51 stability of the results. DAC 0.90 0.44 0.46 0.39 0.55 For a metric we use AUC (area under the ROC curve) which is a Nl 0.62 0.35 0.48 0.49 0.49 metric ranging from 0 to 1 (in practice from 0.5 to 1 as any classi�er "0Gl 1 0.92 0.43 0.37 0.67 0.60 scoring lower than 0.5 can be inverted, transitioning to AUC>0.5) "0Gf 0.43 0.58 0.57 0.50 0.52 with 1 being the most and 0 (in reality 0.5) being the best score. This E6l 1 0.92 0.40 0.56 0.50 0.60 metric was chosen since it shows how well the model can separate E6f 0.82 0.42 0.69 0.60 0.63 the samples into correct classes. It is calculated by �rst constructing NLC 0.47 0.49 0.52 0.50 0.50 the ROC curve, which shows the relation between true positive Table 1: The results of the model for each individual feature rate (TPR) and false positive rate (FPR). It is obtained by moving (only 1 feature used for classi�cation). the threshold and for each value calculating the TPR and FPR. After all points are gathered, the AUC can be calculated. 2.6 Experimetal setup • Octopuses The best features are NA, DAC and h3. AUC Experiments are conducted on the data set of labeled point clouds, score is 0.85 (AUC using only h1 was 0.66). described in Section 2.2. As we have 4 di�erent labels we conduct 4 • Spiders The best features are C>?10_f, Nl, "0Gl 1 and h2. independent experimets, each time choosing one label as positive AUC score is 0.89 (AUC using only h3 was 0.81). (object belongs to that group) and the other labels as negative (object does not belong to the chosen group). The best performing features are greatly dependent on the do- For each of those 4 experimets we �rst transform all the cloud main (speci�c object we want to classify), which is also seen in points to smaller networks using the mapper algorithm. For each Table 1. From our results of combination of best performing features one of those we then extract the features (both topological and we can see improvements over the baseline (using only homology features form network analysis). Once we have the labeled feature for classi�cation). Addition of network analysis attributes to the vectors we perform a grid search over all feature subsets (that feature vector improves classi�cation. However, based on the re- su�ces the size limitations presented in Section 2.4). For each of sults above we cannot determine which features in general perform these subsets we performe a 10-fold cross validation with the SVM best. model (each fold using 90 samples for training and 10 for testing). To determine this we �rst viewed the scores of models using The results are then saved and analysed. only single features to predict the class. Each column (classi�cation problem) has the three best results bolded. As we can see, we could 3 RESULTS not point out a few features that would be signi�cantly better than the others in all cases. The quality of each attribute is greatly The goal of our research is to determine if the additional network dependant on the class that we are trying to predict. This can features contribute to the scores and if so which features are the easily be explained as di�erent objects tend to produce di�erent most bene�cial to include. networks and each has di�erent properties that separate it from To determine whether adding additional features is even bene�- the rest. Some of the best performing features are C>?10_f, "0Gf cial for the predictions, we �rst take a look at the results of the best and "0Gl 1 however, while most of them perform quite well on models (models built on subsets of features that performed best ac- two classes they perform worse on the other two. More consistent cording to the AUC score). Combinations of attributes that achieve features that perform relatively well on most classes are for example the highest score in our test cases are displayed below. In addition C>?10_f and E6f. to the best performing subset of features and the corresponding To further test the importance of features we set a cuto� point at scores we included the results for best performing homology for the score of the model using the best performing homology feature each use case (same results as can be seen in Table 1). (for each classi�cation task) and only consider models with feature • Tables: The best features are "0Gl 1, E6l 1 and h3. AUC subsets that scored higher. For each feature we calculate in what score is 0.96 (AUC using only h3 was 0.84). percent of those models it appears. Those results are presented in • Chairs The best features are NA, h⇠i and NLC in combi- Figure 3. We can observe again that the results mostly di�er when nation with either C>?10_l 1 and "0Gf or h2. AUC score it comes to the observed class. More preferable are features that is 0.96 (AUC using only h1 was 0.9). perform well on all or most of the use cases. Such example would 3 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Gal Petkovšek, Janez Božič, and Marija Marolt 12 pages. https://doi.org/10.1145/2835487 [4] Masaki Hilaga, Yoshihisa Shinagawa, Taku Kohmura, and Tosiyasu L. Kunii. 2001. Topology Matching for Fully Automatic Similarity Estimation of 3D Shapes. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’01). Association for Computing Machinery, New York, NY, USA, 203–212. https://doi.org/10.1145/383259.383282 [5] Anupama Jawale and Ganesh Magar. 2019. Comparison of Image Classi�cation Techniques : Binary and Multiclass using Convolutional Neural Network and Support Vector Machines. (12 2019). [6] Alex Krizhevsky, Ilya Sutskever, and Geo�rey E Hinton. 2012. ImageNet Clas- si�cation with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems, F. Pereira, C.J. Burges, L. Bottou, and K.Q. Wein- berger (Eds.), Vol. 25. Curran Associates, Inc. https://proceedings.neurips.cc/ paper/2012/�le/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf [7] Jiirg Sander Martin Ester, Hans-Peter Kriegel and Xiaowei Xu. 1996. A Density- Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. [8] Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. 2016. PointNet: Deep Learning on Point Sets for 3D Classi�cation and Segmentation. CoRR abs/1612.00593 (2016). arXiv:1612.00593 http://arxiv.org/abs/1612.00593 [9] Aric Hagberg Pieter Swart Dan Schult. 2022. NetworkX: Network analysis library. [10] Gurjeet Singh, Facundo Memoli, and Gunnar Carlsson. 2007. Topological Meth- Figure 3: Chair point cloud and its network made with map- ods for the Analysis of High Dimensional Data Sets and 3D Object Recogni- tion. In Eurographics Symposium on Point-Based Graphics, M. Botsch, R. Pa- per algorithm jarola, B. Chen, and M. Zwicker (Eds.). The Eurographics Association. https: //doi.org/10.2312/SPBG/SPBG07/091-100 [11] Hang Su, Subhransu Maji, Evangelos Kalogerakis, and Erik G. Learned-Miller. 2015. Multi-view Convolutional Neural Networks for 3D Shape Recognition. be h⇠i, Nl, E6l 1 or E6f. The latter two are by themselves a rela- CoRR abs/1505.00880 (2015). arXiv:1505.00880 http://arxiv.org/abs/1505.00880 tively good features. [12] Guillaume Tauzin, Umberto Lupo, Lewis Tunstall, Julian Burella Pérez, Matteo h⇠i is not as successful on its own but is often Caorsi, Wojciech Reise, Anibal Medina-Mardones, Alberto Dassatti, and Kathryn present in the most successful subsets. It indicates how connected Hess. 2021. giotto-tda: A Topological Data Analysis Toolkit for Machine Learning a network is therefore an objects with a more homogeneous body and Data Exploration. arXiv:2004.02551 [cs.LG] might be more connected than the ones with more articulation [13] Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E Sarma, Michael M Bronstein, and Justin M Solomon. 2019. Dynamic graph cnn for learning on point clouds. Acm parts. The number of louvain communities could also be a good Transactions On Graphics (tog) 38, 5 (2019), 1–12. measure to determine out of how many parts the object consists [14] Juan Zhang, Kaleem Siddiqi, Diego Macrini, Ali Shokoufandeh, and Sven Dickin- son. 2005. Retrieving Articulated 3-D Models Using Medial Surfaces and Their since it also appears in many of the best subsets for all use cases. Graph Spectra. 285–300. https://doi.org/10.1007/11585978_19 NA is one of extreme examples that perform very well on two classes but quite worse on the others. Number of articulation points is greatly dependant on how many nodes there are in general and while in some cases it might be a very good indicator of how many articulation parts there are in others it might be deceiving as for example a long leg of a spider might have multiple articulation points. Such features (DAC is another such example) might also be taken into consideration when building a model however it is not guaranteed that they will perform well on all use cases. 4 CONCLUSION Our experiments have shown that it is bene�cial for a model for 3D object classi�cation to use (in addition to topology features) network analysis features. However, it is di�cult to determine which will perform best for a certain use case. We have tested our models on four di�erent classi�cation problems and the results have shown that while some attributes are indeed in general better than the others, for the majority of objects it is hard to determine which combination will perform best. In future work we could expand our research by including a larger dataset (include more objects). We would also like to test more network features, and get an even better insight into what are the most bene�cial attributes. REFERENCES [1] 2022. Polygon Mesh. https://en.wikipedia.org/wiki/Polygon_mesh [2] David Cournapeau. 2022. SciKit Learn: Machine learning in Python. [3] Kan Guo, Dongqing Zou, and Xiaowu Chen. 2016. 3D Mesh Labeling via Deep Convolutional Neural Networks. ACM Trans. Graph. 35, 1, Article 3 (dec 2016), 4 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 E�icient machine learning based graph layout calculation for investigation platform Alen Granda Niko Lukač alen.granda@student.um.si niko.lukac@um.si Faculty of Electrical Faculty of Electrical Engineering and Computer Science, Engineering and Computer Science, University of Maribor University of Maribor Koroška cesta 46 Koroška cesta 46 SI-2000 Maribor, Slovenia SI-2000 Maribor, Slovenia Aleksander Pur Štefan Kohek aleksander.pur@policija.si stefan.kohek@um.si Ministry of the Interior Faculty of Electrical Štefanova ulica 2 Engineering and Computer Science, SI-1000 Ljubljana, Slovenia University of Maribor Koroška cesta 46 SI-2000 Maribor, Slovenia Abstract For graph visualization, most common approach in web appli- Modern web technologies enable interactive visualization of graphs cations is D3.js library [2], which uses Dwyer’s proposition of in a web browser. However, larger graphs, which are common in incremental graph layout calculation. It computes incremental lay- investigation data, bring numerous technical limitations in terms out with complexity of $(|+ |log|+ | + |⇢|) per iteration where V of transfer bandwidth and application responsiveness. In this paper is a set of vertices and E a set of edges [3]. This algorithm is com- we propose a machine learning based method to e�ciently transfer putationally demanding, which is noticeable especially at the side graphs’ data between the client and the server. Graphs are stored of low-power clients (e.g. mobile phones). Therefore, web applica- in the investigation platform database on the server and are trans- tion may su�er from unresponsiveness. Another di�culty is also ferred to web browser on the client for interactive visualization lower bandwidth in comparison to local desktop applications. To and manipulation. For this reason, we utilize a concept called graph construct a layout, algorithm expects complete graph data to be embedding. The main aim is to o�er responsive web application available, which prevents incremental visualization of graphs and due to less complex layout calculation algorithm on the client, less therefore responsiveness of the web application. By application of bandwidth requirements and incremental visualization of nodes our proposition, aforementioned problems can be alleviated due to during graph transfer. We performed distinct experiments, which simultaneous data transfer and graph construction. demonstrate faster graph visualization on the client. We perceived This problem has been extensively studied in the literature in up to 130 times faster layout calculation on the client compared to terms of time and space complexity. Gove [6] recently presented ran- standard force graph algorithms with maximum 1000 nodes, while dom vertex sampling algorithm running at 3 $ (|+ |) time and $ (|+ | 4 ) preserving adequate visualization accuracy. auxilliary space. With combination of parallelization, speedup ratio Keywords of 26.7% may be achieved [17]. In addition, by using specialized machine learning, graph embedding, graph visualization, investiga- method for graph representation termed Log (graph) [1], high com- tion platform, web application pression ratios with minimal cost decompression can be realized. In this paper we propose machine learning-based graph lay- out calculation for web applications. Given nodes and edges, the server application trains neural network based on expected node 1 Introduction coordinates calculated in advance. For the visualization in the web Data representation using graphs is nowadays widely adopted stan- application on the client, only the weights of the neural network, dard at any form of applications [4]. Graphs provide natural and graph nodes and graph edges are transferred from the server. The intuitive method of presenting enormous amounts of research data client is provided with the same neural network as the server. After (e.g. bank transactions between entities, relations on social me- having received the weights, the client is capable of incrementally dia), which daily increase in size. In this paper, the main focus is constructing the layout by simultaneously receiving and predicting on modern web applications enriched with graphs, which are the received node’s coordinates, resulting in improved responsiveness basis of research platforms. Example of such application is an ad- compared to the standard layout calculation algorithms, which vanced investigation platform [13], which facilitates investigations require complete graphs to be available in advance. by visualizing nodes with their relations using graphs. We extend the platform by utilizing suggested approach to facilitate several drawbacks, which are discussed in succeeding paragraph. h�ps://doi.org/10.51939/scores22.02 5 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Alen Granda, Niko Lukač, Aleksander Pur, and Štefan Kohek The experiments were performed on an advanced investigation coordinate in the �nal graph visualization. Nodes in the hidden platform data [13]. Results were compared to baseline visualiza- layer are provided with the hyperbolic tangent activation func- tion algorithms and assessed using graphs of accuracy and time tion, whereas the output layer possesses simple linear activation complexity. We indicated lower bandwidth requirements with ac- function. ceptable accuracy. In addition, faster graph construction on the As an input data, MLP accepts =-dimensional array of node em- client-side was demonstrated. bedding, where = denotes the size of an embedding. The main aim Below, the main contributions of the paper are outlined. of a node embedding is to transform a node into a vector space • E�cient procedure of conveying graph data from the server while preserving its properties [16]. Consequently, it retains node to the client is provided. relatedness, resulting in clustering coupled nodes, while distancing • Graph visualization work�ow is designed, which breaks unrelated nodes. Let 28 denote the 8-th node in the graph. To con- dependency between the nodes for the layout calculation struct its embedding, random walk-based node embedding node2vec and therefore enables incremental visualization of graphs. [8] has been chosen. Random walk is a procedure of joining similar • A methodology is delivered, which increases responsive- nodes in the complex graph to an embedding. As an input it accepts ness of web applications on the client when visualizing the starting node and produces a vector of speci�ed length. Each enormous amounts of nodes and connections because of element in the vector represents a characteristic of the visited node. reduced visualizaton algorithm complexity. With the help of extra parameters, which guide the walk, node2vec • An approach of node coordinate prediction based on the switches between breadth-�rst search (BFS) and depth-�rst search node embedding method called random walk is introduced. (DFS) to simulate the random walk. Correspondingly, it traverses 2 Layout construction local clusters as well as distant ones. Before training the MLP, expected coordinates are computed Proposed algorithm takes advantage of machine learning (ML) to for every node using an implementation of Fruchterman-Reingold predict node coordinates in the graph. However, graphs are known force-directed algorithm [5] from NetworkX Python library [9]. As a to hardly incorporate with ML algorithms due to their diversity learning algorithm backpropagation [11] in combination with Adam [10]. As a consequence, they have to be transformed to an input optimizer [12] is consumed. After having optimized the weights, whose neural network is familiar with. For this reason, a tech- the server sends them back to the client accompanied by the graph nique called graph embedding [7] is utilized. Graph embedding is a nodes and the edges. The client is subsequently responsible for transformation of graphs, which preserves as much graph property graph layout calculation based on the received machine learning information as possible. The main motivation of proposed algo- model and graph data. rithm is to suggest a concept of establishing a machine learning model for layout calculation of nodes on the server and transfer- ring it to the client for execution while preserving original graph appearance. Figure 1 depicts the proposed methodology. The powerful server is responsible for graph feature learning, while the client manages graph visualization. Below, each segment is described in detail. Figure 2: Proposed neural network scheme. 2.2 Graph visualization The client is provided with the neural network from the server. As a consequence, it generates almost identical graph layout after acquiring the MLP weights. For this reason, the only task left for Figure 1: Proposed methodology work�ow scheme. the client is to generate the random walks for an individual re- ceived node and predict its coordinates. After this stage, the client 2.1 Building machine learning model is prepared to visualize the graph. ML model construction happens on the server-side of web appli- 3 Results and discussion cation. The server is responsible to predict node coordinates and The aim of the experiments was to demonstrate improvements to teach the neural network. Figure 2 indicates suggested con�g- in speed compared to the standard visualization algorithms (e.g. uration of neural network based on the multilayer perceptron ar- Fruchterman-Reingold force-directed algorithm [5]) while main- chitecture (MLP) [14] with 2 dense hidden layers, each including @ taining the accuracy of graph visualization. Furthermore, the impact neurons. The output layer consists of 2 neurons, each representing of neural network parameters was inspected by altering the number 6 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 E�icient machine learning based graph layout calculation for investigation platform of neurons @ in the hidden layers and by modifying the number of random walks was manipulated with presupposition of having of random walks A per individual node. Having small number of @ = 100 neurons in each hidden layer of MLP and the length of each random walks might result in many di�erent graph layouts as the random walk = to be 50. Firstly, MLP was trained with 5 random random walks are arbitrarily generated. walks per node and RMSE was calculated. Afterwards, 5 random Experiments were examined on hardware with following spec- walks were added and procedure was repeated. Iteration stopped i�cations: CPU: Intel Core i7-8700, RAM: 16 GB, GPU: GeForce when the number of random walks reached 100. Presented process GTX 1070. In addition, Python programming language was selected was rerun 10 times for graphs having 10, 50, 100 and 1000 nodes. with TensorFlow [15] library of version 2.4.0 for neural network Figure 5 manifests average RMSE. It is noticeable how accuracy construction and NetworkX [9] library of version 2.5 to execute increased as more random walks were included for each node. force-directed algorithm. Correspondingly to Figure 4, exactness started converging to some Input data was acquired from an advanced investigation plat- value as the number of random walks exceeded 60. form [13]. Graphs are directed, containing from 10 to 1000 nodes. Figure 3 depicts an example of a graph with 12 related nodes. Before learning the model, the training set was prepared using output co- ordinates of Fruchterman-Reingold force-directed algorithm. As a consequence, ground truth for results is the latter algorithm, which was executed using NetworkX library. The root mean squared er- ror (RMSE) metric were applied to analyze the correctness of the predicted node coordinates. RMSE indicates how distant are node coordinates between the suggested algorithm and node coordinates from the training set. Individual node coordinate was presented as �oating-point numbers ranging between -10 and 10 units. Training period was limited to 100 epochs and learning rate was set to 0.001. Three experiments were performed. The aim of the �rst was to assess an accuracy of proposed algorithm against the number of neurons @ in the hidden layers of the MLP neural network. The number of random walks A for every node was arranged to 50, each Figure 4: Graphs of accuracy against number of neurons in having the length = = 10. Hence, the size of the input layer was hidden layer of MLP. �xed to 10. Such con�guration has appeared as a good compromise between e�ciency and time consumption when running experi- ments. Beginning with 10 neurons in each hidden layer, number of neurons was increased by 10 until 200. MLP learned the weights and RMSE was calculated in every iteration. Procedure was repeated 10 times for graphs with 10, 50, 100 and 1000 nodes. Figure 4 indicates average RMSE of all recurrences. Improvement of accuracy might be observed as the number of neurons increased, regardless of the initial graph size. However, starting with 100 neurons per hidden layer neural network exhibited no signi�cant enhancement. Figure 5: Graphs of accuracy against number of walks per node. Finally, runtimes were examined. We experimented time con- sumption for node coordinate prediction with given neural network on the client for graphs of distinct size and discovered nearly linear increment with the number of nodes. Compared to an implemen- tation of Fruchterman-Reingold algorithm in NetworkX library, Figure 6 indicates remarkable time-consumption reduction. In com- Figure 3: Example layout construction of proposed method bination with the parallel layout construction while receiving the against Fruchterman-Reingold algorithm. graph data, a considerable potential for minimizing time necessary to compute coordinates is perceived. Standard force-directed algo- The focus of the second experiment was to evaluate precision rithms namely construct node coordinates with a time complexity of proposed algorithm against the number of random walks A per of $ node. Strategy was related to the initial experiment. The number (n2) per individual iteration of the simulation, which might 7 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Alen Granda, Niko Lukač, Aleksander Pur, and Štefan Kohek be time-consuming, especially for graphs extending 400 nodes as methodology could be improved by selecting other neural network shown in Figure 6. Considering that the execution of algorithms oc- architectures. Having inspected the experiments, we have perceived curs on devices with lower hardware speci�cations, the di�erences how RMSE increased with graph size. Nevertheless, results have in time consumption should escalate. improved as the complexity of MLP increased. For this reason, we propose future research to focus on selecting a more complex neural network for given task. Aditionally, we suggest: • Testing an impact of node2vec random walk parameters on accuracy, • Sensitivity of the results to the length of node embeddings, • Testing other graph embedding techniques, • An evaluation of server performance when receiving mul- tiple requests from clients and an evaluation of distributed computation. Acknowledgments Figure 6: Time consumption for node coordinate calculation. The authors acknowledge joint �nancial support from the Slovenian Research Agency and Slovenian Ministry of the Interior (target As mentioned, the purpose of the presented methodology was research programme No. V2-2117). also to decrease bandwidth requirements. Let us presume our propo- sition of MLP architecture with @ = 100 neurons in each hidden References layer and an input layer of size = = 10. Hence, neural network [1] Maciej Besta, Dimitri Stanojevic, Tijana Zivic, Jagpreet Singh, Maurice Hoerold, consists of 11200 weights. On condition that each weight is pre- and Torsten Hoe�er. 2018. Log (graph) a near-optimal high-performance graph representation. In Proceedings of the 27th international conference on parallel sented as 32-bit �oating point, the amount of data required to be architectures and compilation techniques. 1–13. transferred between the server and the client is 44800 bytes, which [2] Michael Bostock, Vadim Ogievetsky, and Je�rey Heer. 2011. D3 data-driven is considerably compact regarding today’s technology. After the documents. IEEE transactions on visualization and computer graphics 17, 12 (2011), 2301–2309. weights are sent, the client is able to gradually construct the layout [3] Tim Dwyer. 2009. Scalable, versatile and simple constrained graph layout. In while receiving the nodes and the edges. Computer graphics forum, Vol. 28. Wiley Online Library, 991–998. A perspective to expose is the resistance of the proposed algo- [4] Nadime Francis, Alastair Green, Paolo Guagliardo, Leonid Libkin, Tobias Lin- daaker, Victor Marsault, Stefan Plantikow, Mats Rydberg, Petra Selmer, and rithm to the number of edges in the graph. Being the graph fully Andrés Taylor. 2018. Cypher: An evolving query language for property graphs. connected or not, the algorithm requires to generate A random In Proceedings of the 2018 International Conference on Management of Data. 1433– 1445. walks of length = for every node. Random walk construction of a [5] Thomas M. J. Fruchterman and Edward M. Reingold. 1991. Graph drawing node without connections would result in a vector containing the by force-directed placement. Software: Practice and experience 21, 11 (1991), node itself 1129–1164. =-times. [6] Robert Gove. 2019. A Random Sampling O (n) Force-calculation Algorithm 4 Conclusion for Graph Layouts. In Computer Graphics Forum, Vol. 38. Wiley Online Library, 739–751. A method for graph visualization in modern web applications using [7] Palash Goyal and Emilio Ferrara. 2018. Graph embedding techniques, applica- MLP neural network has been presented. Based on the experiments, tions, and performance: A survey. Knowledge-Based Systems 151 (2018), 78–94. [8] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for time consumption of proposed procedure is promising compared networks. In Proceedings of the 22nd ACM SIGKDD international conference on to the Fruchterman-Reingold algorithm. In addition, visualization Knowledge discovery and data mining. 855–864. accuracy has been preserved. For this reason, the experiments have [9] Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart. 2008. Exploring Network Structure, Dynamics, and Function using NetworkX. In Proceedings of the 7th manifested that the purpose of proposed algorithm is ful�lled. Python in Science Conference, Gaël Varoquaux, Travis Vaught, and Jarrod Millman One of the concerns might be managing multiple big data re- (Eds.). Pasadena, CA USA, 11 – 15. [10] William L. Hamilton, Rex Ying, and Jure Leskovec. 2017. Representation Learning quests from clients. However, as modern server infrastructures are on Graphs: Methods and Applications. IEEE Data Eng. Bull. 40, 3 (2017), 52–74. getting more powerful each day, we believe our concept has practi- http://sites.computer.org/debull/A17sept/p52.pdf cal value. Instead of executing complex calculations on the client [11] Henry J. Kelley. 1960. Gradient theory of optimal �ight paths. Ars Journal 30, 10 (1960), 947–954. and transferring huge amounts of data between the server and the [12] Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Opti- client, we conceive our method as a replacement to aforementioned mization.. In ICLR (Poster), Yoshua Bengio and Yann LeCun (Eds.). http://dblp.uni- techniques. In addition, proposed methodology is able to construct trier.de/db/conf/iclr/iclr2015.html#KingmaB14 [13] Niko Lukač, Borut Žalik, Matej Brumen, David Jesenko, Štefan Kohek, Andrej layout incrementally at the time of receiving the nodes from the Nerat, and Marko Bizjak. 2019. Zasnova napredne preiskovalne platforme (NNP): server, resulting in improved user experience. The only limitation zaključno poročilo. UM FERI. [14] Leonardo Noriega. 2005. Multilayer perceptron tutorial. is the requirement for transferring edges in an appropriate order School of Computing. Sta�ordshire University (2005). for a random walk on the client. The problem can be alleviated by [15] TensorFlow Developers. 2022. TensorFlow. https://doi.org/10.5281/ZENODO. limiting the length of the walk. Moreover, the server could take 4724125 Accessed: 2022-06-19. [16] Mengjia Xu. 2021. Understanding graph embedding methods and their applica- advantage of caching to avoid neural network learning process tions. SIAM Rev. 63, 4 (2021), 825–853. duplication. [17] Abenov Zhansultan, Aubakirov Sanzhar, and Paulo Trigo. 2021. Parallel imple- In this paper, the main focus was on node coordinate predic- mentation of force algorithms for graph visualization. Journal of Theoretical and Applied Information Technology 99, 2 (2021), 503–515. tion. Nonetheless, our contribution might be reproduced to predict other graph parameters (e.g. graph coloring). In addition, presented 8 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Music genre classification based on spectrograms of the sound recording using an ensemble of CNNs Tadej Lahovnik Vili Podgorelec tadej.lahovnik@student.um.si vili.podgorelec@um.si Faculty of Electrical Faculty of Electrical Engineering and Computer Science, Engineering and Computer Science, University of Maribor University of Maribor Koroška cesta 46 Koroška cesta 46 SI-2000 Maribor, Slovenia SI-2000 Maribor, Slovenia ABSTRACT any pattern is present in the signal, and how a particular signal is Several papers have attempted to classify music genres based on correlated to other similar signals [16]. The extracted audio signal the features extracted from sound recordings. However, none have features represent training data for a selected ML algorithm. implemented an ensemble classi�er of di�erent CNNs for various While such ML approaches have shown some excellent results, types of spectrograms. there are other ways of representing and processing audio signals. One thousand sound recordings from the GTZAN database were An audio signal can be visually represented with spectrograms, used for classi�cation by the authors. Each sound recording was potentially revealing patterns characteristic of di�erent types of converted into three di�erent spectrogram types, resulting in 3000 music. Existing works [3, 12] have already approached the task spectrograms. 85% of the spectrograms were used to train three using spectrograms or features extracted from sound recordings. CNN models, and the remaining 15% were used for testing. The Di�erent spectrogram types can be used to represent an audio sig- individual CNN models formed a classi�er ensemble, which com- nal. For example, Mel spectrograms1,2 can be used instead of typical bined the predictions of respective models into a single prediction spectrograms [3, 9, 17]. As spectrograms are visual representations based on the sum of the scores of respective genres. of music, the most advanced deep neural network-based image clas- Since the accuracy of the classi�er ensemble (54.67%) is higher si�cation techniques can be applied to music genre classi�cation. than the accuracy of the individual classi�cation models (44.00%, In this paper, we propose a new ensemble classi�cation method 53.33%, 26.67%), it was bene�cial to combine the CNN models into that predicts the music genre of a sound recording based on sev- one. The confusion matrix revealed some common errors in genre eral individual convolutional neural network (CNN) models, each prediction. The somewhat low accuracy is likely a consequence of trained on its type of spectrogram. To the best of our knowledge, the truncated sound recordings. Although the classi�er ensemble this is the �rst report on music genre classi�cation using an en- did not achieve high accuracy, it predicted the genre based on the semble classi�er combining di�erent types of spectrograms. An spectrograms of the sound recording more accurately than a human. essential advantage of the proposed method is the direct use of Weighting the individual CNN models could signi�cantly improve spectrogram images for training the CNN models, which does not the results. require the often demanding process of extracting and selecting features from sound recordings. KEYWORDS classi�cation, spectrogram, machine learning, convolutional neural 2 BACKGROUND network, music genre, sound recording 2.1 Music genre 1 INTRODUCTION A music genre is a category of music characterised by a partic- During the last two decades, we have seen an increase in AI-driven ular style. A genre can also be in�uenced by social conventions, recommendation systems on audio streaming platforms. There are marketing, association with a particular artist, and other external two general approaches to music recommendation - collaborative in�uences [2]. and content-based recommendation. While the former recommends Repetition is the foundation of genres. A genre codi�es past objects that the user group of similar preference has liked, the latter repetitions and encourages new repetitions [14]. analyses the content of objects that a user has previously preferred In this paper, genres represent classes for classi�cation. A class is and recommends the ones with relevant content [8]. For content- a set of things that can be grouped meaningfully. We often think of based music recommendation, automatic music genre identi�cation a class in terms of the common properties of its members, especially plays an important role. those that distinguish them from other things that are similar in Various machine learning (ML) methods have been developed many ways [13]. for accurate music genre classi�cation [7]. The algorithms behind these systems are often based on metadata and features extracted from sound recordings using audio processing techniques. Audio 1 signal processing algorithms generally involve analysis of a signal, a method of representing audio visually [17] 2substitutes the frequency on the y-axis with the mel scale and indicates colours using extracting its properties, predicting its behaviour, recognising if the decibel scale instead of the amplitude [6] h�ps://doi.org/10.51939/scores22.03 9 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Tadej Lahovnik and Vili Podgorelec 2.2 Spectrograms A spectrogram represents a signal’s intensity or ’loudness’ over time at the di�erent frequencies present in a particular waveform. Changes in energy levels over time are displayed in a spectrogram. [1].Spectrogramsaretwo-dimensionalgraphs,wherecoloursrepre- sent the third dimension [1]. Time is represented on the horizontal axis. The vertical axis represents frequency, which can also be in- terpreted as pitch or tone. The lowest frequencies are located at the bottom and the highest at the top. 2.3 Spectrogram generation process The Librosa3 library allows us to generate a simple spectrogram from a sound recording. The sound recording is converted into a �oating point time series during the upload process. The resulting Figure 2: Spectrogram (tone/time) time series must be converted from the square of the amplitude to decibels before the spectrogram can be displayed. Figure 1 shows an example of the �rst type of spectrograms we used for classi�cation. The spectrogram shows the presence of speci�c frequencies over time. Orange represents a high presence of a particular frequency, and blue represents a low presence. Figure 3: Spectrogram (chroma/time) 2.4 Classi�cation Classi�cation occurs in many human activities. When used in its broadest sense, the term can refer to any situation in which a predic- tion or decision is made based on currently available information Figure 1: Spectrogram (frequency/time) [10]. Numerous classi�cation algorithms exist, including decision trees, rule-based learning, support vectors, Bayesian networks, and convolutional neural networks (CNNs). If necessary, classi�cation Figure 2 shows an example of the second type of spectrograms, algorithms may also be combined into ensembles (e.g., boosting, which shows the presence of certain tones over time. Orange colour bagging, stacking, tree forests, and more) [12]. represents a high presence of a particular tone, and blue represents a low presence. The markings on the y-axis indicate the individual 2.4.1 Neural network. A neural network is a set of algorithms that octaves (C1-C9). seek to identify the underlying connections in a set of data through Figure 3 shows an example of the third type of spectrograms, a process that mimics the human brain [5]. which shows the presence of tones across all octaves over time. Neural networks have three main components: an input layer, a Orange represents a high presence of a particular tone, and blue processing or hidden layer, and an output layer [5]. Inputs can be represents a low presence. Individual marking on the y-axis includes weighted based on di�erent criteria. all matching tones from di�erent octaves. In neural networks, learning is accomplished by altering the weights across connections in response to new input data or learn- ing patterns [15]. A convolutional neural network is adapted to analyse and recog- 3https://librosa.org/doc/latest/index.html nise visual data such as digital images or photographs [5]. 10 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Music genre classification based on spectrograms of the sound recording using an ensemble of CNNs 3 THE PROPOSED METHOD minimise the loss function of the CNN model. The CNN model out- The GTZAN database [11], which contains 1000 sound recordings, puts a vector with the same length as the number of classes (music was used for the development. Each sound recording is 30 seconds genres). This vector represents the scores of individual classes - the long and belongs to one of the ten genres4 provided by the database. higher the score for a particular class, the higher the expectation The sound recordings were later converted into spectrograms using that the recording belongs to the corresponding music genre. Since the Librosa library. these values can be arbitrary, we used a function called Softmax, Three di�erent types of spectrograms were created for each which ensures that the sum of all the values is 1, thus constraining sound recording in the dataset. 85% of the spectrograms were used the individual scores between 0 and 1 [4]. to train the CNN models, and the remaining 15% were used for 3.1.3 Combining CNN models into an ensemble. After each CNN testing. The training set was partitioned into training and validation model is trained on its type of spectrogram from sound recordings, sets using the Keras5 library. The validation set consisted of 30% of the individual CNN classi�cations are combined into an ensem- the training data. ble, comprising the classi�cation results of all three speci�c CNN The generated images were classi�ed using CNNs provided by models. The instances of three speci�c CNN models have been the Sklearn6 library. Three separate CNN models were created, combined in a single ensemble model by averaging the outputs of trained, and combined into a classi�er ensemble. The predictions the corresponding Softmax layers. Figure 5 showcases the proposed of the individual models were combined into a single prediction ensemble method. based on the sum of the predictions. The classi�er ensemble was evaluated with several metrics7. Additionally, we displayed the results with a confusion matrix, revealing common errors of the implemented classi�er ensemble. 3.1 The ensemble of CNN models 3.1.1 The CNN model. For classi�cation, we used the sequential model from the Keras8 library. Figure 4 shows the visualisation of the model. Figure 5: The proposed ensemble classi�er Figure 4: CNN model 4 RESULTS The Rescaling layer standardises the input data. Conv2D creates Three di�erent CNN classi�cation models were used to perform a 5x5 convolution kernel that is convolved with the layer input the classi�cation. We used 150 test instances or 15% of the total to produce a tensor of outputs. MaxPooling2D downsamples the dataset for prediction. input along its spatial dimensions (height and width) using a 2x2 The �rst classi�cation model (which predicted the genre based pooling window. The Flatten layer �attens the input. The Dense on classical spectrograms) correctly classi�ed 66 test instances. The layer implements an operation that returns a vector with a length accuracy of the �rst classi�cation model was 44.00%. equal to the number of classes, providing the classi�cation scores The second classi�cation model (which predicted the genre based for each respective class (music genre in our case). on spectrograms showing the presence of certain tones over time) correctly classi�ed 80 test samples. The accuracy of the second 3.1.2 Training a CNN model. Each CNN classi�cation model has classi�cation model was 53.33%. been trained separately on its type of spectrogram images (each The third classi�cation model (which predicted genre based on CNN model is trained from scratch with random initialisation of spectrograms showing the presence of individual tones across all weights). Fifty epochs were used to train the model. Each batch octaves over time) correctly classi�ed merely 40 test samples. The contained 128 samples. Stochastic gradient descent was used to accuracy of the third classi�cation model was 26.67%. 4blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, rock After combining the three individual CNN classi�ers into an en- 5https://keras.io semble, the ensemble classi�er correctly predicted 82 test instances. 6https://scikit-learn.org/stable 7accuracy, recall, precision, F-score The accuracy of the ensemble classi�er is 54.67%, which is higher 8https://keras.io/guides/sequential_model than the accuracy of each CNN classi�cation model. In this manner, 11 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Tadej Lahovnik and Vili Podgorelec merging the individual CNN classi�cation models into an ensemble A single track may belong to several di�erent genres. Therefore, classi�er was worthwhile. performing a multi-label classi�cation and comparing the results Figure 6 shows the �nal confusion matrix of the proposed en- would be reasonable. Alternatively, the multi-label classi�cation semble classi�er. The labels of the individual rows show the actual model could be used within the classi�er ensemble. genres of the test instances, while column labels show the genre Features could be extracted from the sound recordings, and an- predicted by the classi�cation ensemble. Values in the cells show other classi�cation model could be added to the classi�er ensemble. the number of test instances belonging to the genre shown in the The classi�cation would likely be more accurate as these features row, classi�ed as the genre shown in the column. In addition to contain additional information. the numerical labels, we can use the colour scale found next to the It might also be worth considering weighing the individual clas- confusion matrix. si�ers in the ensemble. As a result, each classi�er would not neces- The confusion matrix shows that there are some common errors sarily contribute the same amount to the �nal ensemble prediction. in genre prediction. The ensemble often predicted blues as rock, country as jazz and rock, and disco as hip hop, pop, and rock. The ACKNOWLEDGMENTS most incorrect predictions occurred for the disco genre. On the The authors acknowledge the �nancial support from the Slovenian other hand, classical music seems to be the easiest to distinguish Research Agency (research core funding No. P2-0057). from other genres, as it was only misclassi�ed on a few occasions with jazz. REFERENCES [1] [n. d.]. Spectrogram. https://en.wikipedia.org/wiki/Spectrogram Accessed: 2022-07-28. [2] Shlomo Argamon, Kevin Burns, Shlomo Dubnov, and Roger B Dannenberg. 2010. Preprint from The Structure of Style: Algorithmic Approaches to Understanding Manner and Meaning Style in Music. , 45-58 pages. http://www.cs.cmu.edu/~rbd/ [3] Hareesh Bahuleyan. 2018. Music Genre Classi�cation using Machine Learning Techniques. (4 2018). http://arxiv.org/abs/1804.01149 [4] Hmrishav Bandyopadhyay. 2022. Image Classi�cation Explained. https://www. v7labs.com/blog/image-classi�cation-guide [5] James Chen. 2021. Neural Network. https://www.investopedia.com/terms/n/ neuralnetwork.asp [6] Ketan Doshi. 2021. Audio Deep Learning Made Simple - Why Mel Spectrograms perform better. https://ketanhdoshi.github.io/Audio-Mel/ [7] Ahmet Elbir, Hilmi Bilal Çam, Mehmet Emre Iyican, Berkay Öztürk, and Niza- mettin Aydin. 2018. Music genre classi�cation and recommendation by using machine learning techniques. In 2018 Innovations in Intelligent Systems and Ap- plications Conference (ASYU). IEEE, 1–5. [8] Dongmoon Kim, Kun-su Kim, Kyo-Hyun Park, Jee-Hyong Lee, and Keon Myung Lee. 2007. A music recommendation system with a dynamic k-means clustering algorithm. In Sixth international conference on machine learning and applications (ICMLA 2007). IEEE, 399–403. [9] Jash Mehta, Deep Gandhi, Govind Thakur, and Pratik Kanani. 2021. Music Genre Classi�cation using Transfer Learning on log-based MEL Spectrogram. Proceed- ings - 5th International Conference on Computing Methodologies and Communica- tion, ICCMC 2021, 1101–1107. https://doi.org/10.1109/ICCMC51019.2021.9418035 [10] Donald Michie, David J Spiegelhalter, and Charles C Taylor. 1994. Machine learning, neural and statistical classi�cation. (1994). [11] Andrada Olteanu. 2020. GTZAN Dataset - Music Genre Classi�ca- tion. https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music- Figure 6: Confusion matrix genre-classi�cation [12] Nikki Pelchat and Craig M. Gelowitz. 2020. Neural Network Music Genre Classi- �cation. Canadian Journal of Electrical and Computer Engineering 43 (6 2020), 170–173. Issue 3. https://doi.org/10.1109/CJECE.2020.2970144 5 CONCLUSIONS [13] Claude Sammut and Geo�rey I Webb. 2011. Encyclopedia of machine learn- ing. Springer Science & Business Media. https://books.google.si/books?id= Although the achieved classi�cation accuracy (54.67%) does not i8hQhp1a62UC [14] Jim Samson. 2001. Genre. Vol. 1. Oxford University Press. https://doi.org/10. seem to be very high, we have to consider that it is di�cult to 1093/gmo/9781561592630.article.40599 distinguish between 10 di�erent music genres, even for a human. [15] Yi Shang and Benjamin W Wah. 1996. Global optimization for neural network Thus, we are satis�ed with the result. However, there is still plenty training. Computer 29 (1996), 45–54. Issue 3. [16] Garima Sharma, Kartikeyan Umapathy, and Sridhar Krishnan. 2020. Trends in of room for improvement. audio signal feature extraction methods. Applied Acoustics 158 (2020), 107020. Librosa produces wide white margins around the spectrogram, [17] Sugianto Sugianto and Suyanto Suyanto. 2019. Voting-based music genre classi�- cation using melspectogram and convolutional neural network. which are useless for classi�cation. The spectrograms could be 2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), cropped to ensure that only the spectrogram is present in each 330–333. image. Classi�cation of a music genre based on spectrograms is not the most accurate. The low accuracy is most likely also due to the truncated music �les. Each sound recording is only 30 seconds long. If the dataset contained full tracks, we would have a more representative sample, which would most likely improve the results. 12 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Super-Resolution Method for Reconstructing Street Images from Surveillance System based on Real-ESRGAN Nguyen Quoc Toan quoctoann3@gmail.com Department of Electronic and Electrical Engineering, Hongik University, 94 Wausan-ro, Mapo-gu, Seoul, Republic of Korea ABSTRACT obtain a great harmony of simplicity and e�ectiveness. A recent pa- In many cities around the world, large sums of money are invested per [14] assumes a random shu�ing strategy for synthesizing more in surveillance camera systems, but few optimize the bene�ts and practical degradations. But even so, it still includes a set amount costs of those investments, and thus the overall impact of surveil- of degradation processes, and it is unclear whether all the shu�ed lance cameras on crime rates. In this paper, based on a technique degradations are useful. High-order degradation modeling, on the named Real-ESRGAN applied to a practical restoration application other hand, is more adaptable and aims to imitate the real degra- that has been enhanced by the e�cient ESRGAN. It is a super- dation generation process, B8=2 �lters in the synthesis procedure resolution method that was developed in blind super-resolution are employed to simulate ringing and over-shoot artifacts. Because to reinstate low-resolution street images with unknown and com- the degradation space can be much bigger than ESRGAN, training plicated degradations. It can be applied for security purposes in becomes tough. First, the discriminator must be more powerful to surveillance systems. Since video surveillance systems typically distinguish realness from complicated training output. Secondly, the capture low-resolution images in many areas, the detection and discriminator’s gradient feedback should be more precise for local identi�cation of objects are sometimes required. This task’s super- information improvement. In Real-ESRGAN, a VGG-style discrimi- resolution is tough because image appearances vary depending on nator was upgraded to a U-Net design [7][8][13]. Thirdly, the U-Net a variety of factors. The low resolution combined with poor op- architecture and complex degradations increase training instability. tics is completely insu�cient for identifying the subject of interest To balance the training dynamics, spectral normalization (SN) reg- on the street, from a distance, in bad weather, or under any other ularization [6][8] was used. It is simple to train Real-ESRGAN and limitations. Furthermore, to strengthen discriminator capability obtain stability of local detail advancement and artifact suppression and create stable training dynamics, the U-Net discriminator was with the dedicated improvements. employed with spectral normalization. Hence, when compared to other experimental techniques, it can be demonstrated that this 2 REAL-ESRGAN method delivers the best result. Experiment results show that super- resolution recovery of street images taken from a surveillance sys- 2.1 Classical Degradation Model tem is attainable with the following results: PSNR: 30.36dB and Blind SR recovering high-resolution images from low-resolution SSIM: 0.86. images that have undergone unknown and intricate degradations. Based on the underlying degradation process, existing techniques KEYWORDS can be divided into two categories: explicit modeling and implicit modeling. The classic degradation model [1][4] is widely used in computer vision, super-resolution, GAN, U-Net, Real-ESRGAN explicit approaches [2][3][15] and includes blur, downsampling, noise, and JPEG compression. To generate the low-resolution input, 1 INTRODUCTION the classical degradation architecture [1][4] is commonly utilized. Image resolution is a signi�cant factor in calculating image quality. In general, the ground-truth image ~ can be convolved with the The better the resolution, the more detailed the information in the blur kernel : �rst. Afterward, with scale factor A, a downsampling image, making it more robust for objects on street recognition tasks. operation is applied. Adding noise = yields the low-resolution G. Improving image resolution has always been an unstoppable pursuit Lastly, JPEG compression is used because it is ubiquitous in real- of industry and academia. Real-ESRGAN [11] has been applied due world images. In general, D represents the process of degradation. to its signi�cant improvements compared to other experimental methods. The ESRGAN [12] was extended by Real-ESRGAN, to G = ⇡ (~) = [(~ ⇤ :) #A +=] %⇢⌧, (1) restore general real-world LR images by combining training sets with a more practical degradation process. • Blur. Gaussian �lters, both isotropic and anisotropic, are se- Simply put, Real-ESRGAN extends the classical "�rst-order" lected. Although Gaussian blur kernels are typically utilized degradation model to "high-order" degradation modeling tech- to model blur degradation, they may still not accurately niques, i.e., the degradations are modeled with multiple repeated represent real camera blur. Gaussian blur kernels [5] and a degradation processes, each of which is the classical degradation plateau-shaped allocation implement more diverse kernel model. A second-order degradation process is used empirically to shapes are generalized as well. h�ps://doi.org/10.51939/scores22.04 13 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Nguyen �oc Toan Figure 1: Illustration of High-order data generation Real-ESRGAN • Noise. The additive Gaussian and Poisson noise types are The classical degradation model only involves a limited number of applied. The probability density function of additive Gauss- fundamental degradations, which can be thought of as �rst-order ian noise is the same as the Gaussian distribution. The noise modeling. Notwithstanding, real-world degradation is quite vari- intensity is governed by the Gaussian distribution’s stan- ous and typically consists of a series of mergers such as imaging dard deviation. Color noise occurs when unbiased sampled systems of cameras, image collection quality from video, and so noise is present for each channel of RGB images. The Pois- on. In particular, the original image could be a very limited pixel son distribution is initiated by Poisson noise. It is frequently image that the surveillance camera can get far away from the set used to estimate sensor noise caused by statistical quantum up system, which inevitably contains degradations such as camera �uctuations, or variations in the photon �ux sensed at a blur, sensor noise, and low resolution. given exposure level. Poisson noise has a value proportional The classical �rst-order model could not model such a complex to image intensity, and noises at the pixel level are self - deterioration process. Therefore, a high-order degradation model reliant. is employed. An =-order model consists of = repeated degradation • Resize (Downsampling). Downsampling is known as the processes (as shown in Eq.2), in which each degradation process pre- resize operation. Nearest-neighbor interpolation, area re- cedes the classical degradation model Eq.1 but with di�erent hyper- size, bilinear interpolation, and bicubic interpolation are all parameters. It should be highlighted that the term "high-order" is resize algorithms. Di�erent resize operations yield di�er- deployed di�erently here than it is in mathematical operations. It ent outcomes, some produce blurry images, while others primarily refers to the time required to complete the same opera- may produce over-sharp ones with overshoot artifacts. To tion. [14] suggests that the random shu�ing strategy encompasses include more diverse and complex resize e�ects, a random repetitive degradation processes (e.g., double blur or JPEG). The resize operation from the options listed above. Nearest- key is the high-order degradation process which indicates that not neighbor interpolation is included because it exposes the all of the shu�ed degradations are intended. To maintain a reason- misalignment issue and only deems the area, bilinear, and able image resolution, the downsampling equation is altered with bicubic operations. a random resize execution. Therefore, a second-order degradation • JPEG compression. It is a prevalent lossy compression process is employed, as it can remedy most real-world problems methodology for digital images. It decodes images to the while remaining simple. The general �ow of data generation stream YCbCr color space �rst, then downsamples the chroma is represented in Fig.1. channels. After that, the features are extracted into 8 x 8 blocks, and each block is converted with a 2D discrete cosine transform (DCT). [10] provides more information G = ⇡= (~) = (⇡= ... ⇡2 ⇡1)(~) (2) on JPEG compression algorithms. JPEG compression fre- quently presents unappealing block artifacts. A quality fac- 2.3 Ringing and overshoot artifact tor @ 2 [0, 100] re�ects the quality of compressed images, Ringing artifacts commonly occur as spurious edges near sharp with a lower q indicating a higher compression ratio and image transitions. They appear as bands near the edges. Overshoot lower quality. artifacts are frequently combined with ringing artifacts, which man- ifest as a higher jump at the edge transition. The primary source of these artifacts is that signal is bandlimited and lacks high frequen- 2.2 High-order Degradation Model cies. These artifacts are very common and are typically caused by When we use the above classical degradation model to generate a sharpening algorithm, JPEG compression, and so on. B8=2 �lter training pairs, the trained model can handle some real-world sam- assists in cutting o� high frequencies and synthesizing ringing and ples. Nevertheless, it is still unable to handle some complex degrada- overshoot artifacts for training sets, its �lters are deployed twice: tions in the real world, particularly unknown noises, and complex during the blurring process and at the end step of the synthesis. artifacts. This is due to the fact that synthesized low-resolution The arrangement of the last B8=2 �lter and JPEG compression is images still have a signi�cant di�erence from realistic degraded randomized transferred to cover a larger degradation space because images. To model more practical degradations, the classical degra- some images may be over-sharpened (with overshoot artifacts) and dation model was extended to a high-order degradation process. then JPEG compressed, while others may be JPEG compressed �rst 14 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Super-Resolution Method for Reconstructing Street Images from Surveillance System based on Real-ESRGAN and then sharpened. The equation of B8=2 is represented in Eq.3, Table 1: Results comparison between Real-ESRGAN, BSR- where (i,j) represents the kernel coordinate, l GAN and Bicubic 2 denotes the cuto� frequency, and 1 means the �rst order Bessel operation of the �rst kind: Method Bicubic BSRGAN Real-ESRGAN l p : (8, 9) = 2 p 82 + 92) (3) 2 PSNR (dB) 25.51 28.09 30.36 c 82 + 92 1(l2 SSIM 0.71 0.76 0.86 2.4 Networks and Training 2.4.1 ESRGAN. Firstly, the same generator (SR network) as ESR- using the mean squared error (MSE). MSE is a statistical concept GAN [12] Fig.2 is used, i.e., a deep network with residual-in-residual that means that an estimator’s mean square error is the mean of the dense blocks (RRDB). In conducting super-resolution with scale fac- squares of the errors, or the di�erence between the estimate and tors of x2 and x1, the extension with x4 ESRGAN architecture was what is evaluated, lower value means better performance. On the represented. Since ESRGAN is a large network, the pixel-unshu�e contrary, the greater the value of SSIM and PSNR, the higher the (an inverse function of pixelshu�e [9]) was used before continuing quality of the reconstructed image followed by Eq.4. . represents to feed inputs into the main ESRGAN architecture to lower spa- the ground truth (reference) and .⇤ describes reconstructed images: tial size and increase channel size. As a result, most calculations are conducted in a smaller resolution space, which relieves GPU 2552 %(# '(., . ⇤) = 10 log10 (4) memory and the computational consumption of resources. "(⇢ " # 2.4.2 U-Net discriminator with spectral normalization (SN). Since 1 ’ ’ Real-ESRGAN tackles a much bigger degradation space than ES- "(⇢ = (. ⇤(8, 9) . (8, 9))2 (5) "# RGAN, the existing discriminator architecture in ESRGAN is no 8=1 9=1 The SSIM evaluation between patches longer appropriate. For complex training outputs, the discrimina- %.⇤ and %. is formulated as: tor in Real-ESRGAN necessitates more discriminative ability. It (2` (( " (% % . ⇤ `% . + 21)(2f. ⇤f% . + 22) (6) must ensure an accurate gradient responses for local textures in . ⇤, %. ) = (`2% + 2 + 2 . ⇤ + `2% . 1)(f2%.⇤ + f2%. 2) addition to discriminating global styles. The VGG-style discrimi- wherè ) and f nator in ESRGAN was enhanced to a U-Net architecture with skip % . ⇤ (`% . % . (f% . ⇤ ) are the knowing and standard deviation of patch % connections, inspired by [8][13]. The U-Net generates realness val- . ⇤ (%. ). 21 and 22 are minor constants. So, the mean score of patch-based SSIM over the image is SSIM (.⇤,.). ues for each pixel that can provide the generator with detailed per-pixel responses. Meanwhile, the U-Net architecture and com- 4 CONCLUSION plicated degradations increase training instability. Regularization of spectral normalization [6] can aid in the stabilization of training In this paper, by applying Real-ESRGAN, a method for reconstruct- dynamics. Furthermore, spectral normalization can help to reduce ing low-resolution street images into recognizable images accept- the oversharpness and annoyance caused by GAN training. Real- able for recognition information tasks. The model achieved out- ESRGAN training can easily reach a better balance of local detail standing results (SSIM:0.86, PSNR:30.36dB). It proved that gener- improvement and artifact suppression with these adjustments. ating degraded real-life scenarios input play an extremely vital role in super-resolution models for street image recognition tasks. 3 EXPERIMENT Real-ESRGAN performs greatly in the removal of artifacts and the restoration of texture details. 3.1 Dataset A brand-new dataset collected by HAIL was used for the experiment. ACKNOWLEDGMENTS It was collected by recording videos on streets in Seoul, South Korea I would like to express my heartfelt appreciation to HAIL (Hongik with a Hanwha Techwin PNO-A9081RLP camera, then frames were University - Arti�cial Intelligence Laboratory, Seoul, Republic of extracted into images. There are 4500 images in total for use. The Korea), which is advised by Professor Seongwon Cho, for facilitating resolution is 4K (4096 x 2160). And 3500 crop images in the test me in carrying out this research. set only included license plates, car brands, and text on the car for model evaluation. Fig.3 is samples from the proposed dataset. REFERENCES [1] Michael Elad and Arie Feuer. 1997. Restoration of a single superresolution 3.2 Result image from several blurred, noisy, and undersampled measured images. IEEE transactions on image processing 6, 12 (1997), 1646–1658. The luminance-based evaluation and comparison criteria, SSIM [2] Jinjin Gu, Hannan Lu, Wangmeng Zuo, and Chao Dong. 2019. Blind super- (Structural Similarity) and PNSR (Peak Signal-to-Noise Ratio) are resolution with iterative kernel correction. In Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition. 1604–1613. used. To illustrate, two images were used for these evaluations. [3] Yan Huang, Shang Li, Liang Wang, Tieniu Tan, et al. 2020. Unfolding the alter- We refer to them as image 1 and image 2. Image 1 is the original nating optimization for blind super resolution. Advances in Neural Information degraded image from the test dataset, while image 2 is a reconstruc- Processing Systems 33 (2020), 5632–5643. [4] Ce Liu and Deqing Sun. 2013. On Bayesian adaptive video super resolution. IEEE tion of image 1. SSIM is a method for calculating the similarity of transactions on pattern analysis and machine intelligence 36, 2 (2013), 346–360. two images. The SSIM values range from -1 to 1. PSNR is calculated [5] Yu-Qi Liu, Xin Du, Hui-Liang Shen, and Shu-Jie Chen. 2020. Estimating general- ized Gaussian blur kernels for out-of-focus image deblurring. IEEE Transactions to compare the quality of image 1 to image 2 which is calculated by on circuits and systems for video technology 31, 3 (2020), 829–843. 15 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Nguyen �oc Toan Figure 2: ESRGAN generator network. It �rst uses a pixel-unshu�le operation to decrease the spatial size and re-arrange information to the channel dimension for scale factors of x1 and x2. (a) (b) (c) Figure 3: a is sample from the train set, b and c are samples from the test set Figure 4: Reconstructed image result comparing between Real-ESRGAN vs BSRGAN and Bicubic [6] Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. 2018. [13] Yitong Yan, Chuangchuang Liu, Changyou Chen, Xianfang Sun, Longcun Jin, Spectral normalization for generative adversarial networks. arXiv preprint Xinyi Peng, and Xiang Zhou. 2021. Fine-grained attention and feature-sharing arXiv:1802.05957 (2018). generative adversarial networks for single image super-resolution. IEEE Trans- [7] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional actions on Multimedia 24 (2021), 1473–1487. networks for biomedical image segmentation. In International Conference on [14] Kai Zhang, Jingyun Liang, Luc Van Gool, and Radu Timofte. 2021. Designing a Medical image computing and computer-assisted intervention. Springer, 234–241. practical degradation model for deep blind image super-resolution. In Proceedings [8] Edgar Schonfeld, Bernt Schiele, and Anna Khoreva. 2020. A u-net based dis- of the IEEE/CVF International Conference on Computer Vision. 4791–4800. criminator for generative adversarial networks. In Proceedings of the IEEE/CVF [15] Kai Zhang, Wangmeng Zuo, and Lei Zhang. 2018. Learning a single convolutional conference on computer vision and pattern recognition. 8207–8216. super-resolution network for multiple degradations. In Proceedings of the IEEE [9] Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P Aitken, conference on computer vision and pattern recognition. 3262–3271. Rob Bishop, Daniel Rueckert, and Zehan Wang. 2016. Real-time single image and video super-resolution using an e�cient sub-pixel convolutional neural network. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1874–1883. [10] Richard Shin and Dawn Song. 2017. Jpeg-resistant adversarial images. In NIPS 2017 Workshop on Machine Learning and Computer Security, Vol. 1. 8. [11] Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. 2021. Real-esrgan: Train- ing real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 1905–1914. [12] Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Yu Qiao, and Chen Change Loy. 2018. Esrgan: Enhanced super-resolution generative adversarial networks. In Proceedings of the European conference on computer vision (ECCV) workshops. 16 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 An analysis of time complexity and a discussion on interpretability for two methods of constructing social network graphs Žan Jonke zj0527@student.uni-lj.si Faculty of Computer and Information Science, University of Ljubljana Večna pot 113 1000 Ljubljana, Slovenia Munich Innovation Labs GmbH Pettenkoferstr. 24 80336 Munich, Germany ABSTRACT How data is collected from such platforms and what information Gathering useful information from user interactions on social me- can be extracted is of great utility. Being able to visually consider dia is a challenging task but has several important use cases. For a network and to investigate it manually can be of great impor- example, law enforcement agencies monitor social media for threats tance to analysts. In order for such a visualization to be bene�cial, to national security, marketers use them for launching marketing appropriate parts of the network graph (i.e. important users and campaigns, etc. Since most social media platforms do not provide a content) must visually stand out. Our aim is to provide two ways standardized way of monitoring their data, most analyses are car- of doing so and comparing them to one another from a technical ried out manually. We aim to expedite this process by constructing perspective i.e. analyzing how much time is needed to construct social network graphs, where analysts can visually determine what such graphs and comparing them in terms of interpretability of users and contents are important. In this paper we compare two their visualization. di�erent approaches for constructing such graphs (path-weighted and degree-weighted). We analyze the time complexity of graph 2 RELATED WORK construction and discuss the usefulness of their visualization. In Social network graphs can be constructed based on direct or inferred order to empirically evaluate both approaches, a method was de- relations, including re-posting, replying or mentioning, through the veloped, which stochastically generates data adhering to rules that shared use of hashtags or URLs, reciprocation or minimum levels of govern the generation of data on a social media platform. We found interaction activity, or friend/follower connections [4]. Karunasek- that constructing degree-weighted graphs is faster, although the era et al. [12] constructed networks of Twitter [10] accounts based visualization of a path-weighted graph can answer more questions on re-posts and mentions to discover communities active during the about the dataset. 2017 German election, valuing mentions and re-posts equally. URL sharing behaviour is often studied in the detection and classi�cation KEYWORDS of spam and political campaigns [2, 7, 17]. Social media, network analysis, graph construction complexity Edwards et al. [6] evaluated several di�erent approaches of ex- tracting social network graphs from datasets, which included link- 1 INTRODUCTION ing two people if they were detected at the same event. Nasim et al. [13] introduce an approach of how to detect content polluting Social media platforms have grown in popularity over the last two bots on Twitter. Their approach was to construct a two-mode user- decades, manifesting several new ways of how humans interact event network linking two users if they had posted contents on the with one another. The central idea is that users post contents and same day. Nguyen Vo et al. [16] constructed social network graphs other users interact with them. For example, on Facebook [9] users which helped them evaluate an algorithm for revealing and detect- post photos, write texts, create events etc. and other users like, ing malicious re-posting groups. In their approach they considered comment or re-post the contents. only re-posts between users and for each pair determined re-post Law enforcement agencies monitor social media platforms for similarity and connected them if it was high enough. extremist groups, which might use them for spreading misinfor- mation, incitement of violence or other forms of threat to national 3 METHODS security [8, 14]. The application of social network analysis in marketing can 3.1 Constructing graphs provide marketers with valuable insights for developing commu- In this section we propose two di�erent methods of constructing nication and branding strategies by building up social capital in discrete graphs given a dataset which can be obtained from a social social networking sites [1, 5]. media platform. The dataset contains entities called users and the h�ps://doi.org/10.51939/scores22.05 17 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Žan Jonke content that they generated. Contents can also be reactions to one Share 1 User 3 Comment 1 User 1 User 2 another i.e. a comment or a share. Post 1 Both methods have nodes of classes "user", "content", "comment" Comment 1 Legend (edges): Legend (edges): and "share". The methods di�er in the way how edges are formed Posted by Posted between the nodes, in which direction they are oriented and in the Post 1 Share 1 Reaction to Posted way the node weight is calculated. User 1 Posted by Posted User 3 The �rst method calculates node weights based on their degree User 2 Share of Reacted to (degree-weighted). We present this method in graph theoretic terms Node weight Posted by Shared as follows: Let ⌧3 (+, ⇢) be a directed (degree-weighted) graph. Let * ⇢ + be the set of users, ⇠ ⇢ + the set of contents, ⇠ Figure 1: Example of a path-weighted (left) and degree- 2 ⇢ + the set of comments and ⇠ weighed graph (right) on the same dataset. B ⇢ + the set of shares. Let D 2 * , 2 2 ⇠, 22 2 ⇠2 and 2B 2 ⇠B. Edges {(D,2), (D,22), (D,2B)} ⇢ ⇢ only ifD is the author of 2, 22 or 2B. In this method there are no edges linking directly comments to contents or shares. Such a relation is represented • the number of contents generated, denoted as # as two edges {(D,2), (D,22)} ⇢ ⇢ where D is the author of 22 (the • the probability that the new content generated is a com- same holds for 2B should 22 be a comment on a share). Shares are ment, denoted as ?2 modelled in the same manner. The weights of edges are equal to 1, Our approach makes the assumptions that very little content on however the weights of nodes are equal to the node degrees. social media gains very high attention and that most users either The second method is based on calculating the node weight post content or react to it but rarely both. To re�ect this, each ran- based on the number of simple paths that lead to that node (path- dom content node that got generated got assigned a weight, which weighted). We present this method in graph theoretic terms as was sampled from a Pareto distribution [3], which is a power-law follows: probability distribution that is used in description of social, quality Let ⌧? (+, ⇢) be a directed (path-weighted) graph. Sets and nodes control, scienti�c, geophysical, actuarial, and many other types of are de�ned as above. Edges {(2,D), (22,D), (2B,D)} ⇢ ⇢ only if D is observable phenomena. This weight was used for weighed sam- the author of 2, 22 or 2B and {(22,2), (22,2B), (22,22), (2B,2), (2B,2B)} pling, when assigning user reactions i.e. comment and shares to ⇢ ⇢ only if 22 is a comment to 2, 2B or 22 or if 2B is share of 2 or contents. Each user also got assigned a weight upon creation also 2B . The weights of all edges are equal to 1, but the weights of all sampled from a Pareto distribution, which got used when sampling nodes are equal to the number of directed simple paths ending in users for authors of contents. A second weight is generated, which that node. is equal to the inverse of the previous one and is used when sam- Cycles in the network would prevent the calculation of node pling users for authors of comments. Using an inverse and weighed weights from converging. The proposed connection rules guarantee sampling manifests unsymmetrical behaviour of users in regards to that no cycles exist in the resulting networks. This can be easily seen their posting habits of contents and comments. We observed that in the following way: a node in a cycle must have both incoming most users who post high impact content are more likely to do so and outgoing edges. As such * nodes can not be part of a cycle more often than others. To re�ect this, the weights of all contents (they only have incoming edges). ⇠ nodes may have both incoming got multiplied by the weight of its author (i.e. the user). We also (from comments or shares) and outgoing (to the authors) edges. In wanted to capture the observation that new users are more likely to order for a cycle to be formed there would need to be a directed enter a discourse, when a popular post was made. As such, when a edge from an author of the content to its comment, this is however new random content got generated with a high enough weight, the not possible since such an edge is not de�ned. The same proof can probability of a new user spawning in their respective pool, got in- be applied to both ⇠2 and ⇠B nodes. creased and linearly decayed with each new content generated. We The degree-weighted graph can be easily understood and o�ers also assumed that content shares are more similar to content than quick means for analysis, while the path-weighted graph is a bit they are to comments and therefore introduced a transformation more complex, albeit o�ers more in-depth information. In �gure 1 from content to share, which was dependent only on an individual we illustrate examples for both methods on the same dataset. user and their probability to post a share. For each share, a content was sampled using their respective weights. To take into account 3.2 Generating random data nested comments, these can also be sampled when assigning user reactions, however we observed these are not as frequent and as We want to empirically evaluate the time complexity of construc- such their weights are sampled from the uniform distribution. tion of both graphs and as such need many datasets with di�erent numbers of contents and comments. Datasets from social media 4 RESULTS platforms varying in size might however be governed by di�erent social dynamics, which have an e�ect on the network topology. 4.1 Time complexity of construction Sampling from those datasets hence carries the risk of introducing We assume a dictionary representation of a discrete graph. Com- an uncontrolled selection bias. Therefore we propose a way to arti�- puting a degree-weighted graph takes O(|+ |) time. This can be cially generate data (only systematically biased by the assumptions achieved by iterating over all contents, comments and shares and being made) in a stochastic manner governed by two parameters: adding corresponding nodes and edges. Calculating node weights 18 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 An analysis of time complexity and a discussion on interpretability for two methods of constructing social network graphs Figure 3: Comparison of interpretability of the path- weighted (left) and degree-weighted (right) graph. It is not possible to determine to what content does the comment belong to in the degree-weighed graph. 4.2 Interpretability Here we demonstrate a way how one could interpret the node weights and degrees of both types of graphs. In a degree-weighted Figure 2: Time complexities of graph construction for both graph a content’s weight re�ects the sum of all users that reacted to methods. Both depend linearly on it, with either a comment or a share. The same interpretation can be |+ | + |⇢|. made for both comment and share nodes. A user’s weight re�ects the sum of all their activities. Conversely, in a path-weighted graph a content’s weight re�ects the sum of all comments and shares written as a reaction to it, hence its "impact". As before, the same interpretation can be made for both comments and shares. A user’s takes O(|+ |) time because all that is needed is looping over all nodes weight re�ects the sum of the contents written by the user (the and calculating the node degree, which can be done in constant "activity") plus the sum of the their impacts (total impact), hence time. their (relative) importance. Additionally the degrees of nodes in Constructing a path-weighted graph as described above takes a path-weighted graph can be interpreted as weights of degree- O(|+ |) time since the same concept of looping over all contents, weighted graphs. comments and shares can be used. Now we consider calculating As an edge is missing from either a comment or a share to the node weights. Calculating one simple directed path between two content in a degree-weighted graph, this results in an ambiguity. nodes using depth �rst search takes O(|+ | + |⇢|) time [15]. Naively Consider �gure 3. It shows two visualizations of the same data. For doing so for all nodes and all possible paths results in a combinato- the degree-weighted graph it can not be determined by visualization rial explosion in terms of time complexity. This analysis assumes alone where the comment belongs to, however this is not the case that lengths of paths are comparable to |+ |. However in our case for the path-weighted graph. this assumption can not be made since long paths require nested comments or shares i.e. comments to comments or shares of shares. 4.3 Visualization For simplicity we exclude such paths from our analysis since we Here we show some visualizations of networks using the random observed those are usually not found very frequently in interactions data generating model described above. We generated two datasets on social media. We estimate that calculating all simple paths for all with values of nodes to take # equal to 1000 and 2000 and ? O(|+ | + | |) time, where is a list of all ancestors of 2 equal to 0.8. In �gures 4 and 5 two visualizations of path-weighted and degree- all nodes. We assume that calculating all ancestors of a node to be weighted graphs are shown. The visualizations were done using constant since we are not considering nested comments or shares. a Javascript library called vis.js where graph physics enable for a The following observations can be made: more �exible visualization. The physics solver ForceAtlas2 was used [11]. The tool allows for zooming and makes even larger networks • | | ⇡ |+ | + |⇢| since every edge roughly introduces one new ancestor, still manageable to analyze manually. However, these visualizations take a lot of computing power and it becomes very time consuming • a node might have a large number of ancestors, however each ancestor has at most two outgoing edges and is at to render networks which contain more than 5000 nodes. This issue most at a distance 2 from the source, meaning that depth can be mitigated by using alternative libraries with GPU support. �rst search �nds all simple paths between two nodes in constant time. 5 DISCUSSION We have presented and analyzed two di�erent approaches (path- When |+ | gets large enough, most of the time needed for the compu- weighted and degree-weighted) for constructing directed graphs tation of a path-weighted graph is used for calculating node weights. from data modelling social media networks. We have shown that in Therefore calculating a path-weighted graph takes O(|+ |+|⇢|) time. terms of time complexity the degree-weighted graphs are favourable, An empirical evaluation of time complexity can be seen in �gure since they require less time to construct, although the di�erence 2. Datasets of di�erent sizes were constructed with # ranging from only becomes noticeable when the dataset is very large. 0 to 45000 with a step of 500. At every step we generated 30 random The advantages of degree-weighted graphs are that they are fast datasets using the above described method and measured time of to construct and their visualizations are easy to interpret regarding construction for both approaches. questions about the most popular posts and which users post most 19 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Žan Jonke https://doi.org/10.1145/2806416.2806572 [3] Henry T. Davis and Michael L. Feldstein. 1979. The Generalized Pareto Law as a Model for Progressively Censored Survival Data. Biometrika 66, 2 (2022/09/27/ 1979), 299–306. https://doi.org/10.2307/2335662 [4] Lewis Mitchell Derek Weber, Mehwish Nasim and Lucia Falzon. 2021. Exploring the e�ect of streamed social media data variations on social network analysis. CoRR abs/2103.03424 (2021). arXiv:2103.03424 https://arxiv.org/abs/2103.03424 [5] Shaun Doyle. 2007. The role of social networks in marketing. Journal of Database Marketing Customer Strategy Management 15, 1 (2007), 60–64. https://doi.org/ 10.1057/palgrave.dbm.3250070 [6] Michelle Edwards, Jonathan Tuke, Matthew Roughan, and Lewis Mitchell. 2020. The one comparing narrative social network extraction techniques. In 2020 Figure 4: Visualizations of degree-weighted graphs with # IEEE/ACM International Conference on Advances in Social Networks Analysis equal to 1000 (left) and 2000 (right). and Mining (ASONAM). 905–913. https://doi.org/10.1109/ASONAM49781.2020. 9381346 [7] Fabio Giglietto, Nicola Righetti, Luca Rossi, and Giada Marino. 2020. It takes a village to manipulate the media: coordinated link sharing behavior during 2018 and 2019 Italian elections. Information, Communication & So- ciety 23, 6 (2020), 867–891. https://doi.org/10.1080/1369118X.2020.1739732 arXiv:https://doi.org/10.1080/1369118X.2020.1739732 [8] John S. Hollywood, Michael J. D. Vermeer, Dulani Woods, Sean E. Goodison, and Brian A. Jackson. 2018. Using Social Media and Social Network Analysis in Law Enforcement: Creating a Research Agenda, Including Business Cases, Protections, and Technology Needs. RAND Corporation, Santa Monica, CA. https://doi.org/ 10.7249/RR2301 [9] Meta Platforms Inc. 2004. Facebook. https://www.facebook.com. Accessed: 2022-07-29. [10] Twitter Inc. 2006. Twitter. https://www.twitter.com. Accessed: 2022-07-29. [11] Mathieu Jacomy, Tommaso Venturini, Sebastien Heymann, and Mathieu Bastian. 2014. ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Figure 5: Visualizations of path-weighted graphs with # Visualization Designed for the Gephi Software. PLOS ONE 9, 6 (06 2014), 1–12. equal to 1000 (left) and 2000 (right). The same datasets were https://doi.org/10.1371/journal.pone.0098679 used as in �gure 4. [12] Fred Morstatter, Yunqiu Shao, Aram Galstyan, and Shanika Karunasekera. 2018. From Alt-Right to Alt-Rechts: Twitter Analysis of the 2017 German Federal Election. (04 2018), 621–628. https://doi.org/10.1145/3184558.3188733 [13] Mehwish Nasim, Andrew Nguyen, Nick Lothian, Robert Cope, and Lewis Mitchell. frequently. The corresponding nodes will have a high weight and 2018. Real-Time Detection of Content Polluters in Partially Observable Twit- therefore visually stand out. A disadvantage of this method is that ter Networks. In Companion Proceedings of the The Web Conference 2018 (Lyon, France). International World Wide Web Conferences Steering Committee, Re- further analysis is harder to conduct i.e. it is harder to answer public and Canton of Geneva, CHE, 1331–1339. https://doi.org/10.1145/3184558. questions about the most important users, the impact of individual 3191574 posts and about the users most people interact with. [14] International Association of Crime Analysts. 2018. Social Network Analysis for Law Enforcement [White paper]. (2018). The advantage of path-weighted graphs is for analysts to be [15] Robert Sedgewick. 2001. Algorithms in C, Part 5: Graph Algorithms, Third Edition able to answer the above questions more easily since all relations (third ed.). Addison-Wesley Professional. are unambiguously re�ected by graph edges and more meaningful [16] Nguyen Vo, Kyumin Lee, Cheng Cao, Thanh Tran, and Hongkyu Choi. 2017. Revealing and Detecting Malicious Retweeter Groups (ASONAM ’17). Association weights are given to content and user nodes. However, the disad- for Computing Machinery, New York, NY, USA, 363–368. https://doi.org/10. vantage of this method is that network graph construction takes a 1145/3110025.3110068 [17] Tingmin Wu, Sheng Wen, Yang Xiang, and Wanlei Zhou. 2018. Twitter spam longer time. detection: Survey of new approaches and comparative study. Computers Security Analysing our model for generating random social media data, 76 (2018), 265–284. https://doi.org/10.1016/j.cose.2017.11.013 we �nd that it lacks modelling of phenomena where users are more likely to react with only a small pool of users (their friends and family), rather than with users who post the most popular content. As a next step we plan to test the validity of our �ndings on real social media data and analyse the time complexities of adding new nodes to an existing graph in a realistic monitoring scenario. ACKNOWLEDGMENTS We would like to express our very great appreciation to Mathias Uh- lenbrock and Oussama Jarrousse for their valuable and constructive suggestions during the planning and development of this research project. REFERENCES [1] Ioannis Antoniadis and Anna Charmantzi. 2016. Social network analysis and social capital in marketing: theory and practical implementation. International Journal of Technology Marketing 11 (01 2016), 344. https://doi.org/10.1504/ IJTMKT.2016.077387 [2] Cheng Cao, James Caverlee, Kyumin Lee, Hancheng Ge, and Jinwook Chung. 2015. Organic or Organized? Exploring URL Sharing Behavior. (2015), 513–522. 20 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Empirical evaluation of sequential, parallel and distributed implementations of k-means clustering Andrej Perković Aleksandar Tošić 89201045@student.upr.si aleksandar.tosic@upr.si Faculty of Mathematics, Natural Faculty of Mathematics, Natural Sciences and Information Technologies Sciences and Information Technologies University of Primorska University of Primorska Glagoljška ulica 8 Glagoljška ulica 8 SI-6000 Koper, Slovenia SI-6000 Koper, Slovenia ABSTRACT The placement of such facilities is a logistics optimisation problem In this paper we present a sequential, parallel and distributed im- commonly known as the F������� L������� P������ (FLP). Ex- plementation of the infamous k-means clustering algorithm. We isting research is mostly focused on mathematical modeling and perform extensive testing of all three implementations on state the linear programming to �nd the near optimal positioning of facilities art hardware, and show the performance bene�ts of paralelliza- considering all constraints [1, 3]. However, due to the computa- tion. The research was inspired by a use-case of reverse logistics tional complexity of the problem, such solutions are not scalable optimisation of wood in Germany, which translates to a facility for large logistic networks such as the entire EU zone. location problem. K-means is an heuristic approach that renders K-means clustering has been heavily explored as a heuristic surprisingly good results compared to mathematical modelling ap- approach to solving large problems where linear programming proaches, which are usually not feasible in large inputs as they solvers become infeasible [9]. Clustering is a simple and powerful belong to the class of NP-hard problems. principle for making sense of large swats of data. It is becoming more and more important in today’s data-intensive and data-driven KEYWORDS society [7]. -means clustering is a rudimentary algorithm for achieving this goal. It has been one of the researchers’ favorite, k-means, Clustering, MPI, Parallel computing with a plethora of variations and tweaks [16]. It is widely studied, with the most notable work coming from Lloyd [10], Forgey [5], 1 INTRODUCTION Friedman and Rubin [6] and MacQueen [11]. The concept of reusing waste wood is not a new concept [4]. Over The crux of the algorithm are it’s two steps - the 0BB86=<4=C the years, the concept of reusing or recycling waste wood has (binding) BC4? and the update step. In the former, we assign each been gaining increasing attention both from academia and industry point to the "closest" cluster centroid. In the latter, we recalculate the participates. The potential of reusing waste wood has multiple centroid of the cluster. De�nition of closeness depends on the choice bene�ts such as positive environmental e�ects as less trees need of the algorithm. The algorithm stops once there are no changes to be cut in order to meet the demand of raw materials. Currently, in the binding. Recently, clustering has been used to improve run- in most countries waste wood is collected but rarely sorted or time of M�����I������ L����� M����� (MILP) to give the solver decontaminated. Both of these processes are required before reusing a better initial state then random [2]. waste wood. The process of sorting is important to �lter out wood In this paper we present an open-source parallel and distributed not suitable for reuse, which is mainly due to size constraints, and implementation of the k-means algorithm. We evaluate our imple- type of wood (hardwood/softwood). The decontamination process mentation on the aforementioned use case using data obtained from involves some mechanical cutting and grinding of parts of the the statistical o�ce in Germany. Our initial data-set contains 10.000 suitable waste wood to remove unwanted objects (nails, screws, accumulation sites, which accumulate used wood. The dataset was etc..) and chemical compounds such as adhesives. However, both of prepared by statistically estimating the amount of wood that should these processes are inherently costly, and the capital and operational accumulate in an area based on the population density, and average expenses need to be justi�ed by the added value obtained by reusing waste wood created per inhabitant. Both of the aforementioned waste wood as oppose to buying new raw material [1]. To achieve statistics were obtained from the statistical o�ce of Germany. Each this goal, legislation must both subsidise the transition to circular site is located using GPS coordinates and distances are computed economy and at the same time impose restrictions or taxes on using the Cosine-Haversine Formula [13]. excessive ⇠$2 emissions [14]. Due to lack of investment and motivation by market participants, 2 IMPLEMENTATION most of the waste wood is burned for energy and put into land- To tackle the problem at hand, we decide to use the weighted :- �ll where burning is not an option due to heavy contamination. means clustering algorithm. It’s di�erent form the classical version Sometimes, this is done by accumulation sites directly. in that it takes into consideration the capacity of the facility, not just For a successful implementation, new facilities for sorting and the Euclidean distance. This way the calculation of the centroid of a decontamination must be built. These facilities would then o�er cluster is biased towards the larger collection facilities of that cluster, recycled wood to the market to fund their operational expenditure. which re�ects practical needs of placing the treatment facilities h�ps://doi.org/10.51939/scores22.06 21 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Andrej Perković and Aleksandar Tošić Master Workers(n) Initialize Sites Centroid Selection Binding MPI.bcast( ) Generate Centroids MPI.scatterv( ) MPI.scatterv( ) MPI.al Gatherv( ) Check Stop Condition (n) Terminate MPI.gattherv(flags) Update MPI.bcast( ) MPI.scatterv( ) MPI.gatherv( ) MPI.bcast( ) Figure 2: An outline of the distributed algorithm’s work�ow using MPI. Figure 1: An example of the output results using MPI with 10 clusters of 10 000 sites. need to know location of current centroids. They can safely share this since they only read the values in this step. There is no critical closer to places with more signi�cant accumulation of wood. Hence, sections here regarding sites, since threads are accessing partitions we get the coordinates of the new centroid in the following way: of the Site list, i.e. disjoint sets. On the other hand, the parallel ’ program can get into the race condition in the part of the code where G8 · F8 a Site object is appended to a cluster’s list of the given objects. For G8 2⇠9 this reason, we performed the insertion of sites to appropriate lists ⇠9 (G) = ’ F sequentially after the computations are done. One would argue that 9 G the ArrayList.add() method could have been made synchronous, 9 2⇠ 9 but there are situations where even this can fail. Moreover, doing The new ~ coordinate of the centroid is calculated analogously. this sequentially has a negligible in�uence on the performance. In the assignment step, we assign to each data point the clos- In this mode, the stopping condition is returned as the result est centroid. The Euclidean distance formula for calculating the value of the function distance is used here. We assign point bindCluster(). If it is true, only then do we ?8 to cluster ⇠9 if it holds enter the block of code that initiates the update step. that: q For the update step, we call the method updateCentroid(), 9 = arg min{ (? which very simply creates a Runnable for each cluster and sends it 8 (G ) ⇠; (G))2 + (?8 (~) ⇠; (~))2} ; to the executor. For the initialization, we decided to go with the Frogy method of randomly choosing : points as initial cluster centorids from the 2.2 Distributed mode given data set, which is proven to be the best one for the simple This mode is more "independent", or better put, "self-contained" [12]. :-means [8]. We implemented it with the MPJ express library [15]. Due to the na- ture of message passing and MPJ’s underlying implementation in C, 2.1 Parallel mode it is not possible to pass complex structures between processes, like This mode was implemented using the thread pool principle with Cluster and Site. We can only natively send the primitive types. the help of Executors class [12]. This allows for dynamic control of For this reason, we decided to "serialize" the Cluster and Site threads, so there is less possibility for human error. Synchronization arrays. We transform them into double arrays, centroidBuffer, was done with the CountDownLatch instances called barrierBind represented with orange circles in Figure 2, and siteBuffer, repre- and barrierUpdate, reinitialized before each assignment and up- sented with blue circles, respectively. For Cluster transformation, date step, respectively. we extract the 83, ;0C8CD34 and ;>=68CD34 of each instance. Hence, For the binding step, each thread gets an approximately equally for cluster 8, we have its 83 at position 38, its ;0C8CD34 at position 38+1 sized chunk of Site collection to process. To do this, all threads and its ;>=68CD34 at position 38 +2 in centroidBuffer. Similarly for 22 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Empirical evaluation of sequential, parallel and distributed implementations of k-means clustering the sites, we store the 83 at position 58, ;0C8CD34 at 58 + 1, ;>=68CD34 at 58 + 2, F486⌘C at 58 + 3 and 2;DBC4A ⇡ at 58 + 4 in the siteBuffer for the site 8. Process 0 acts as the master and completes the setup, 75000 that is the transformation into double arrays. After that, we start looping. Processes loop as long as the stopping condition is not sat- is�ed. To check this, there is a separate bu�er for �ags, represented 50000 Version with red and green circles in Figure 2. Each process gets a �ag with distributed the help of the Scatter function to signalize whether it registered parallel sequential a change in its binding step. Runtime (ms) Since the �rst step in the algorithm is assignment, the coordinator 25000 �rst broadcasts centroidBuffer and then scatters the siteBuffer. We used Scatterv for the latter, since the number of sites need not be divisible with the number of processes running. 0 After this, the change �ags are returned to the coordinator. It 0 250000 500000 750000 1000000 decides whether the stopping condition has been reached and broad- Points casts that to the workers, which is represented with the Check Stop Condition box on the Figure. Based on that, we either terminate the Figure 3: Performance evaluation of all three implementa- calculations or proceed to the update step. tions. The results were obtained by increasing the number of For the update step, we decided to implement it in the way that points for each test while keeping the number of centroids all the processes loop through the entire siteBuffer, but only at 100 perform calculations on instances whose 2;DBC4A ⇡ corresponds to the cluster centroid instance the given process is assigned. For this reason, we used the Allgatherv on the siteBuffer right after the 200000 assignment step, but then we used Scatterv on centroidBuffer to assign approximately equal number of clusters to each process. 150000 3 RESULTS To evaluate our implementation, all three versions were imple- Version mented and tested for performance. All tests were performed on distributed 100000 parallel the same hardware namely, two AMD Epyc CPU’s with 64 compute Runtime (ms) sequential cores each, 512GB of RAM running Linux. To test the performance we conduct two separate tests with 20 workers for the concurrent versions. Firstly, we test the impact the number of sites has on 50000 performance by �xing the number of centroids and increasing the number of sites. Secondly, we test the impact centroids have on performance by �xing the number of sites. 0 In Figure 3 we show the scalability of all three implementations. 0 2500 5000 7500 10000 As expected, we observe a reduction in run-time in both parallel Centroids and distributed over the serial implementation. In Figure 3, we can see that the communication overhead in distributed computation Figure 4: Performance evaluation of all three implementa- pays o� only for the heavier half of the test cases. In general, shared tions. The results were obtained by increasing the number of memory should perform much faster then bu�ered IO used by the centroids for each test while keeping the number of points loopback interface. at 100.000,00 Figure 4 shows the impact of centroids on performance. As ex- pected, the sequential version scales linearly, while both parallel How di�erent caching strategies and the Cluster Con�guration and distributed see a marginal hit on performance. of the distributed part, better re�ecting the real-world performance of the aforementioned computing, in�uence on the comparisons 4 CONCLUSIONS AND FUTURE WORK made in this paper; additionally, pushing the boundary of the test What we did expect is for the parallel and the distributed to perform data size further, beyond what can �t in a single computer’s or much better than the sequential. But what was unexpected is the server’s memory, and how does that in�uence the performance of di�erence between MPI’s performance in Figure 3 and 4. It is almost the two would be the subject of our future work. on par with the performance of the parallel version for the case of �xed number of sites, yet far from it in the case of �xed number ACKNOWLEDGMENTS of centroids. We attribute this to caching. Array of sites is much The authors gratefully acknowledge the European Commission bigger than the latter, so having them �xed could be the reason for for funding the InnoRenew CoE project (H2020 Grant Agreement the performance kick. #739574) and the PHArA-ON project (H2020 Grant Agreement 23 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Andrej Perković and Aleksandar Tošić #857188) and the Republic of Slovenia (Investment funding of the their applications. Emerging technology in modelling and graphics (2020), 69–83. Republic of Slovenia and the European Union of the European [8] Greg Hamerly and Charles Elkan. 2002. Alternatives to the k-means algorithm Regional Development Fund) as well as the Slovenian Research that �nd better clusterings. In Proceedings of the eleventh international conference on Information and knowledge management. 600–607. Agency (ARRS) for supporting project number J2-2504. [9] Ke Liao and Diansheng Guo. 2008. A clustering-based approach to the capacitated facility location problem 1. Transactions in GIS 12, 3 (2008), 323–339. REFERENCES [10] Stuart Lloyd. 1982. Least squares quantization in PCM. IEEE transactions on information theory 28, 2 (1982), 129–137. [1] Michael David Burnard, Črtomir Tavzes, Aleksandar Tošić, Andrej Brodnik, and [11] James McQueen. 1967. Some methods for classi�cation and analysis of multivari- Andreja Kutnar. 2015. The role of reverse logistics in recycling of wood products. ate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical In Environmental implications of recycling and recycled products. Singapore : Statistics and Probability, 1967. 281–297. Springer, cop. 2015, 1–30. [12] Andrej Perkovic. 2022. K Means Clustering. https://github.com/AndrejPer/k_ [2] Jean-Thomas Camino, Christian Artigues, Laurent Houssin, and Stéphane Mour- means_clustering gues. 2021. MILP formulation improvement with k-means clustering for the [13] C Carl Robusto. 1957. The cosine-haversine formula. The American Mathematical beam layout optimization in multibeam satellite systems. Computers & Industrial Monthly 64, 1 (1957), 38–40. Engineering 158 (2021), 107228. [14] Erwin M Schau, Črtomir Tavzes, Igor Gavrić, Iztok Šušteršič, Eva Prelovšek [3] Péter Egri, Balázs Dávid, Tamás Kis, and Miklós Krész. 2021. Robust facility Niemelä, Balázs Dávid, Jaka Gašper Pečnik, and David DeVallance. 2022. Envi- location in reverse logistics. Annals of Operations Research (2021), 1–26. ronmental and economic assessment of using wood to meet Paris Agreement [4] Bob Falk et al. 1997. Opportunities for the wood waste resource. Forest Products greenhouse gas emission reductions in Slovenia. In E3S Web of Conferences, Journal 47, 6 (1997), 17–22. Vol. 349. EDP Sciences, 03005. [5] Edward W Forgy. 1965. Cluster analysis of multivariate data: e�ciency versus [15] Aamir Sha�, Bryan Carpenter, and Mark Baker. 2009. Nested parallelism for interpretability of classi�cations. biometrics 21 (1965), 768–769. multi-core HPC systems using Java. J. Parallel Distributed Comput. 69, 6 (2009), [6] Herman P Friedman and Jerrold Rubin. 1967. On some invariant criteria for 532–545. https://doi.org/10.1016/j.jpdc.2009.02.006 grouping data. J. Amer. Statist. Assoc. 62, 320 (1967), 1159–1178. [16] Junjie Wu. 2012. Advances in K-means clustering: a data mining thinking. Springer [7] Attri Ghosal, Arunima Nandy, Amit Kumar Das, Saptarsi Goswami, and Mri- Science & Business Media. tyunjoy Panday. 2020. A short review on di�erent clustering techniques and 24 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Exact Exponential Algorithms for the Center Closing Problem Samo Metličar Jurij Mihelič samo.metlicar@hotmail.com jurij.mihelic@fri.uni-lj.si Faculty of Computer and Faculty of Computer and Information Science, Information Science, University of Ljubljana University of Ljubljana Večna pot 113 Večna pot 113 SI-1000 Ljubljana, Slovenia SI-1000 Ljubljana, Slovenia ABSTRACT • ; 8 P, – In this paper, we focus on the center closing problem which is • 2 P = ( in similar to the well-known :-center problem. Both problems are • 8 , ⌫ 2 P : < ⌫ =) \ ⌫ = ;. de�ned on a network with the goal of optimizing the worst-case From this the formal de�nition of the problem can follow. service time for the clients, but with the di�erence that in the center closing problem several existing centers are closed in order De�nition 2.1 (Center closing problem). Let ⌧ = (+, ⇢) be a com- to optimize total cost of operation. First, we show the plete undirected network, sets of centers N P-hardness ( and consumers ⇠ form a of the problem. Afterwards, we describe several exact exponential partition of + , ⌘ : ⇠ ! R+0 be a non-negative weight function and algorithms for solving the problem. Finally, experimentally evaluate : be a positive integer not greater than |( |. For every subset ' ✓ ( these algorithm on two test scenarios. de�ne the cost as cost(') = max min ⌘(2)3(2,B). KEYWORDS 2 2⇠ B 2(\' center closing problem, exact algorithm, combinatorial optimiza- The problem is to �nd such a set ' ✓ (, where |'| :, which tion, experimental evaluation, set cover minimizes the cost. The problem can also be presented in the decision version by 1 INTRODUCTION adding a bound ⌫ and asking if there exists a solution of cost at Facility location problems deal with the optimality of the placement most ⌫. of objects into space. Its roots stretch at least back to the 17th cen- tury in the form of Fermat’s problem [5] and it has since branched 3 HARDNESS OF THE PROBLEM out signi�cantly. In this paper, we discuss one such problem – the Because of the similarity to the :-center [10] problem and other center closing problem. We de�ne it and its main properties and facility location problems we assume that it is also NP-hard. We propose several exact algorithms for solving it. Those are also im- prove the assumption by reducing the :-center problem to the plemented and tested using di�erent benchmarks. center closing problem in polynomial time. Let ⌧0 = (+, ⇢) be a The problem deals with closing a set number of centers (e.g. post complete undirected network and :0 be a non-negative integer not o�ces) that o�er some service to consumers (e.g. towns) and is greater than |+ |. The :-center problem is to �nd a set % ✓ + , where represented by a network (e.g. road network). |% | :0, which minimizes the cost A lot of other prominent facility location problems can also be found in the literature. One such example is the cost min :-suppliers problem, (%) = max 3 (E, ?). E 2+ ? 2% analyzed by Hochbaum and Shmoys [6]. A similar problem where the metric is limited to the We reduce the !2 space is examined by the authors :-center problem to the center closing problem as of [11]. There are also more complex variations of the problem follows: where objects are more dynamic and can move or change over • network ⌧ contains 2 copies of each node in + and sets the time [7, 8]. Such problems are often solved with integer linear weight of the edge connecting them to 0. The rest of the programs instead of algorithmically. edges are copied from ⇢, • ( = ⇠ = + , ie., the set of nodes represents both the set of 2 PRELIMINARIES centers and the set of consumers, We begin by introducing basic concepts used in the paper. The • ⌘ = 1⇠, ie., all consumer weights are 1, and center closing problem �rst requires a network, which we de�ne • : = |(| :0, ie., the number of closed centers as opposed as a graph with weighted edges. For each pair of nodes to the number of opened centers. D and E the length of the shortest path between them is denoted by 3(D, E). For L���� 3.1. Set %⇤ solves the :-center problem if and only if set each node E in the network and an arbitrary bound ⌫ we also de�ne '⇤ = ( \ %⇤ solves the center closing problem. its bounded open neighbourhood #⌫ (E) as a set of nodes connected to E by a path of length at most ⌫, excluding E itself. Finally, let ( P����. Let us denote by cost1 and cost2 the cost functions of the be a set. We say that the family of sets P forms a partition of set ( :-center problem and the center closing problem. Let % be some (not if the following statements hold: necesserily optimal) solution to the :-center problem and ' = ( \ %. h�ps://doi.org/10.51939/scores22.07 25 Proc. of the 8th Student Computing Research Symposium (SCORES’22), Ljubljana, Slovenia, October 6, 2022 Samo Metličar and Jurij Mihelič Then it holds that |%| :0 and |'| : and ' is thus also a solution 4.3 Branch and Bound to the center closing problem.1 We now observe the cost functions: We upgrade the backtracking algorithm with the branch and bound cost2(') = max min 1 strategy, denoted with + (2)3 (2, B) bnb. During the branching part of the al- 2 2+ B 2+ \' gorithm, a priority queue is maintained determining the order in = max min 3 (E, ?) which centers are processed. On each step, the intermediate solu- E 2+ ? 2% tion is evaluated enabling the use of the bound part of the strategy. = cost1(%). It takes advantage of the fact that the cost is non-decreasing with Lets say there exists '0 < '⇤ s.t. cost2('0) < cost2('⇤). Since respect to closing additional centers. Therefore, if the intermediate the costs are equivalent it follows that cost1(+ \ '0) < cost1(%⇤). evaluation ever exceeds the current lowest cost found, the whole Therefore %⇤ does not solve the :-center problem – a contradiction. branch can be pruned. ⇤ The algorithm starts without an upper bound, meaning it might not prune much until a decent upper bound is determined. To We have shown how to reduce the :-center problem to the center address this, the algorithm �rst solves the problem with an approx- closing problem in polynomial time. We have also shown that the imate algorithm and uses that solution as the starting bound. This cost functions match in both problems, therefore the decision ver- alteration of the algorithm will be abbreviated with bnb+. sions of them can use the same boundary and will return equivalent results. From [9] it then follows that since the decision version of 4.4 Reduction to the Set Cover Problem the :-center problem is NP-complete, this must also hold for the decision version of the center closing problem. Further it follows Here, we develop an algorithm based on the reduction to the de- from [12] that since the decision version of the center closing prob- cision version of the set cover problem, which is de�ned as fol- lem is lows. Let * be a universal set, S be a family of subsets of * , and N P-complete, its optimization version must be N P-hard. ✓ 2 N a boundary. The question is whether there exists such a 4 EXACT EXPONENTIAL ALGORITHMS family M ✓ S of size |M| ✓, that covers the universal set * , i.e., – 4.1 Brute-force Enumeration " 2 M " = * . We reduce the decision version of the center closing problem to the set cover problem as follows: The most straightforward approach to solving the problem is gen- • * = ⇠, i.e., the set of suppliers represents the universal set, erating all subsets ' of set (. Let = = |⇠|, < = |(|, and : = |'|, then • S = {#⌫ (B) | B 2 (}, i.e., the family consists of sets of neigh- there are < possible subsets in total. Each has to be evaluated bors of the existing centers, and : resulting in the time complexity ): (=,<) = <: (< :)=. For a • ✓ = < :, i.e., the bound for the set cover problem. �xed : this results in a polynomial time of O(<:+1=), however in L���� 4.1. Family the worst case scenario where M⇤ ✓ S solves the set cover problem if and : = <2 the time complexity is ex- only if subset '⇤ ✓ (, where {# ponential, namely ⌫ (B) | B 2 ( \ '⇤} = M⇤, solves the O(2
>
stored together with the corresponding Sprague-Grundy val-
><0 if 1 > 1 and 0 > 1 and 0 + 1 is even
ues.
(⌧ ('
D is initially empty.
0,1 ) = >2 if 1 2 or 0 2 and 0 + 1 is odd
>>:1 otherwise.
Algorithm 1 is particularly useful for identifying (most of the
L���� 1.2 (LCTR [5]). Let _ be a partition. Then we have
times in�nite) patterns of winning positions. As we will see in the
(1) _0[8, 9] = _[9,8]0 and
subsequent sections, those patterns are then proved formally, or
(2) (⌧ (_0) = (⌧(_).
posed as an open conjecture. Without the aid of such an algorithm,
computing Sprague-Grundy values would involve exhaustive case-
2 THE COLUMN-ROW GAME
analysis even for a single starting position.
Column-Row game (CR for short) is an impartial, combinatorial
game played on a Young Diagram. A move in CR consists of remov-
3 SPRAGUE-GRUNDY VALUES OF
ing any single row or any single column from the diagram. We do
COLUMN-ROW GAME ON A QUADRATED
not make the distinction between the CR game and the correspond-
PARTITION
ing Young diagram, thus the notations CR(_) and _ are equivalent.
In this section, we present a solution to the
We denote the row and column moves as
Column-Row game
8A and 92 , where 8 and 9 are
on a family of quadrated partitions in terms of traditional
the indices of the row and column which we remove from the dia-
N/P-
positions.
gram, respectively, and write
8
9
_
A! e_for a row move and _ 2! e_
for a column move. Refer to Fig. 4 for a visual representation of the
L���� 3.1. Let & = (G<1
1 , . . . , G<: ) be quadrated, and let ⌧ be a
:
moves of CR.
subposition of &. Then
(1) ⌧ is not quadrated.
(2) There exists a quadrated subposition of ⌧.
(3) ⌧ is an N-position, & is a P-position.
P����. As a consequence of Lemma 1.2, it is enough to consider
only the row moves of
8
&. Suppose ⌧ is obtained by &
A! ⌧
for some 8. We show each item in turn. Item 1 and Item 2 are an
immediate consequence of the properties of quadrated partitions,
while Item 3 is shown by induction.
Figure 4: An example of both types of moves.
(1) Removing a row reduces the number of occurrences