Muzikološki zbornik • Musicological Annual LVIII/2
DOI: 10.4312/mz.58.2.81-105
UDK 78 (497.11):070.447"1878/1941":004

Computational Research of Music Criticism between 1878 and 1941 in Serbian: What Can We Learn from the Digital Archive of the Historical Newspapers in the Svetozar Marković University Library?

Ivana Perković, a Anđelka Zečević b
a University of Arts, Belgrade
b Serbian Academy of Sciences and Arts, Belgrade

ABSTRACT
The article deals with critical writing on music in the digitized newspapers held by the Svetozar Marković University Library in Belgrade. The multidisciplinary research aims to connect the quantitative and qualitative analysis of music criticism in Serbian in the period between 1878 and 1941, and to explore how to combine computational and traditional musicological approaches.

Keywords: music criticism, natural language processing, text analysis, musicological canon

IZVLEČEK
Članek se ukvarja s kritiškimi zapisi o glasbi v digitaliziranih časopisih v univerzitetni knjižnici Svetozarja Markovića v Beogradu. Cilj multidisciplinarne raziskave je povezati kvantitativno ter kvalitativno analizo glasbenih kritik v srbščini v obdobju med letoma 1878 in 1941 ter raziskati, kako združiti računalniške in tradicionalne muzikološke pristope.

Ključne besede: glasbena kritika, procesiranje naravnega jezika, analiza besedila, muzikološki kanon

Introduction

The process of digitization has affected many fields of musicological research, including the area of music criticism.
However, computational research on the reception of musicians, musical pieces and performers in Serbian-language writing has not been done so far for a number of reasons, the most important of which include: (1) the limited availability of digital resources and (2) the requirements of Natural Language Processing (NLP) systems regarding the language used to write about music in historical sources. Furthermore, musicologists in Serbia often lack the computational and technical know-how to conduct this type of research, since these fields of knowledge are not part of the regular academic curriculum.

The multidisciplinary research carried out for the purpose of this paper – to our knowledge, the first investigation of this kind related to Serbian resources – began with the definition of the repository, which consists primarily of writings related to music criticism in a digital format. The repository includes newspapers published across 63 years, between 1878 (the date of the independence of Serbia) and the start of the Second World War in Yugoslavia (1941). This time span is usually divided into two historical periods: the “long nineteenth century” 1 and the interwar period. A brief presentation of the historical and cultural background in these two periods follows, for a better understanding of the context.

The second half of the long nineteenth century was marked by frequent and complex political changes in the states inhabited by Serbs, including independence from Turkish domination (1878), the establishment of the Kingdom of Serbia (1882–1918), the change of dynasty from Obrenović to Karađorđević (1903), the Balkan Wars (1912–1913), and World War I. This is the time when important cultural and scientific institutions were founded, among which we find the Serbian Literary Guild (Srpska književna zadruga, 1892) and the Serbian Literary Magazine (Srpski književni glasnik, 1901–1914, 1920–1941).
Significant impulses for the development of science came particularly from the Academy of Sciences and the Higher School (Velika Škola) that was transformed into the University of Belgrade (1905). Culture was also subjected to many changes, while national idealism and Romanticism were of special importance for cultural and artistic life in general.

1 The concept of the long nineteenth century in Western history, denoting the period between the French Revolution and the beginning of the First World War (1789–1914), relies on the historical-theoretical model of Eric Hobsbawm. For more information, cf. Phyllis Weliver and Katharine Ellis, eds., Words and Notes in the Long Nineteenth Century (Woodbridge, UK; Rochester, NY: Boydell & Brewer, 2013).

The interwar period brought numerous transformations within the Kingdom of Serbs, Croats and Slovenes (1918–1929) and later the Kingdom of Yugoslavia (1929–1941), resulting in aspirations towards unification and in changes to various social structures, the economy, education and other fields. This was a time of high achievements of cultural history, and the key words here are certainly Europeanization and modernization. The renewal and establishment of institutions, whether cultural, scientific or educational, affected the overall level of prosperity, so that new tendencies in art, culture and science were synchronized with similar phenomena occurring in other European countries. The progressive and liberal journal Nova Evropa (New Europe) was founded in Zagreb in 1920, and a year later (1921) the avant-garde journal Zenit (Zenith, Zagreb and Belgrade) appeared, while in 1922 the Cvijeta Zuzorić Society of Friends of Art was founded, promoting interest in the arts, to name just a few examples.
The latest stylistic challenges were present in art (which does not mean that tradition was forgotten), and artists and scientists who had completed their education abroad successfully transferred their latest achievements to the Yugoslav environment. The increasingly developed bourgeoisie provided a good social framework for the establishment of numerous artistic associations, the development of amateurism and the provision of an educated audience.

The main aim of the study is to explore how to combine the opportunities offered by computational quantitative analysis with traditional musicological qualitative research methods, and to indicate what this collaboration can bring to music research methods, particularly in terms of potential future research fields. Qualitative observations are combined with quantitative data in order to explore some of the selected topics relevant to the study of music criticism.

For the purpose of this paper, three main topics were selected as the most relevant in traditional musicological research of music criticism in Serbian: (1) the changes in the discourse in the selected historical period, (2) the language related to the emotional impact of a certain performance or a new musical composition and (3) the enrichment of factual knowledge about music history. Further research need not exclude other themes and topics that might emerge from the large amount of gathered material; rather, it is open to insights that are at once broader and more refined, depending on the research discourse selected.

The Creation of the Corpus

Within Europeana, 2 one of the largest projects intended to preserve the rich cultural heritage of Europe, the Svetozar Marković University Library contributed digitized newspapers that uncover the social, political, and cultural life of the people of those states inhabited by Serbs.
This effort included the digitization of more than 500,000 pages in Serbian, dating from the beginning of the nineteenth century to the present, seen through the lenses of dozens of periodicals. As such, the collection represents a valuable source of knowledge and stories embroidered around important historical events.

2 Europeana, https://www.europeana.eu/en .

All the digitized newspaper pages can be accessed via the official web application, 3 where users can browse content and visually explore the pages using a number of functionalities. 4 When it comes to the textual content, the web application allows users to search through the digitized collection by entering query terms and selecting the options for narrowing the search to a specific newspaper source and/or period. If a user enters more than one query term, by default the search returns newspaper pages containing all the terms. In addition, the application supports an exact match search by enclosing the query terms in quotation marks (for instance, “koncertna muzika”), a search for a disjunction of terms by using the / or OR delimiter (for instance, “svirati”/“pevati”), as well as a search for pages that do not contain the given query term by using the prefix - (for instance, “-muzika”). Although very helpful, these search mechanisms do not cover all the relevant retrieval scenarios, for example, a search by all morphological word forms. The option “Search for similar words and expressions” allows an extended search based on the edit distance between given query terms and search terms, but offers no means of configuration or additional control.

3 Pretraživa digitalna biblioteka, https://pretraziva.rs/search.
4 As stated, these functionalities are partially based on the open source BnLViewer developed by the National Library of Luxembourg, available at “BnlViewer,” Sourceforge, https://sourceforge.net/projects/bnlviewer/.

Figure 1: Application provided by the Svetozar Marković University Library. Panels shown: browse panel, preview panel (available in Serbian only).

Upon retrieval, the application generates a list of relevant newspaper pages followed by the number of results. Each newspaper page is described with the appropriate metadata and a link to the page image that can be visually explored by using functionalities such as page zooming, page rotation, or page scrolling. In order to collect and analyze the text content of the pages, we needed to develop an application programming interface (API) on top of the results list. As far as we know, no official API provided by the Svetozar Marković University Library supports this kind of content aggregation. Therefore, for the seed query terms, the selection of which will be described later, a pool of HTTP requests is sent to the web application: the HTTP requests are generated recursively by following the links present in the result list, in the spirit of the best crawling practices. The textual content of the newspaper pages is extracted from the HTTP response together with the metadata prepared for the interaction between the user and the application via a web browser.

The initial corpus encompassed the content of newspaper pages obtained using the query term “muzika” (music) with the default option for searching similar words and expressions, since Serbian is an inflected language whose nouns vary by case, gender, and number. In total, the corpus contained the text content of 16,843 newspaper pages.
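The collection step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the `crawl` function, the `LinkExtractor` class, the `fetch` callable and the example URLs are all hypothetical, and the traversal is written iteratively with a queue rather than recursively.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_pages=100):
    """Follow links breadth-first from start_url.

    `fetch` is a hypothetical callable that maps a URL to its HTML text;
    injecting it keeps the sketch independent of any HTTP library.
    Returns a dict {url: html} with at most max_pages entries.
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:       # never request a page twice
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

In practice, `fetch` would wrap an HTTP client, pause between requests, and the loop would keep only links that belong to the result list; none of those details are specified in the text, so they are left out here.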
As expected, some of the pages in the corpus referred to relevant morphological forms of the word “muzika” such as “muzike” or “muzikom.” However, there were a large number of pages containing word forms similar to the word “muzika” with respect to the edit distance, such as “fizika” (physics) or “jezika” (language), but with no overlapping semantics. Separate requests for each morphological form of the word “muzika” would generate a significantly larger number of results, in total 42,052 pages. As this corpus was intended for familiarization with writing on music, we decided to keep the smaller of the two but to perform separate searches for future retrievals.

Figure 1 (continued). Panel shown: search panel.

By manual analysis of the pages in the corpus, we discovered plenty of resources related to musical education, theatre performances, ceremonies, and other similar activities not directly relevant to the study of music criticism. However, the analysis pointed us to the language of music critics: the past-tense forms of perfective verbs such as “otpevao” ([he/she] sang), “odsvirao” ([he/she] played), “izveo” ([he/she] performed), as well as the expressions “izvedba je bila/izvođenje je bilo” (the performance was […]) and “nastupio je” (he performed), were very frequent. This motivated us to use these verbs, their morphological forms, and both their Ekavian and Ijekavian pronunciation variants to create the new corpus. The total number of queries was 24, resulting in 3,442 pages. We were aware that this corpus is by no means complete, but that it would provide us with a more accurate collection to test the usefulness of NLP methods in the study of musicology. For a number of collected pages, the metadata was missing, in most cases the exact publication date or the title of the periodical.
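The overlap between inflected forms and unrelated words noted earlier for the query “muzika” can be illustrated with the standard Levenshtein edit distance. The application's actual similarity measure and threshold are not documented in the text, so the function below is only an assumption about what “edit distance” means there.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Inflected forms of "muzika" and unrelated words from the text.
for word in ["muzike", "muzikom", "fizika", "jezika"]:
    print(word, levenshtein("muzika", word))
```

Under this measure “muzikom” (a relevant form) and “fizika” or “jezika” (irrelevant words) are all two edits away from “muzika”, so a distance threshold alone cannot separate them, which is consistent with the noise reported for the initial corpus.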
We filtered the results to fit the time range from 1 January 1878 to 6 April 1941. In total, 150 pages were eliminated.

As stated, one search result refers to a single page of a digitized newspaper that contains the given query terms. The query appearance is visually emphasized by a red rectangle around the query terms. From the web application, a user can click and single out only the page segment with the given query term, zoom in on it and read its textual content. Depending on the source and page layout, the segment is usually a text paragraph, a newspaper headline, a title, or an image caption. The page segmentation is not entirely accurate, as we noticed numerous adjacent horizontal or vertical segments that had been merged. An example of such a scenario, in which the extracted text is intertwined, is shown in Figure 2. The more complex the page layout and its visual presentation are, the more challenging the segmentation and the subsequent NLP steps become.

Figure 2: The page segmentation. 5

5 Vreme (May 20, 1927), 4; http://istorijskenovine.unilib.rs/view/index.html#panel:pp|issue:UB_00043_19270520|article:div540|page:4|query:%D0%BE%D0%BB%D0%B3%D0%B0%20%D0%BB%D0%B8%D0%BC%D0%B8.

Each segment has its own unique identifier that becomes known only after a user clicks on it and selects it for detailed viewing. We observed this behaviour by following the URLs in the application address bar. On some pages, segments with consecutive identifiers refer to two adjacent vertical paragraphs, while on other pages they refer to two adjacent horizontal paragraphs. Therefore, it was not possible to correctly calculate the segment identifier and single out the exact text block it contains programmatically.
In order to approximate the notion of the segment that corresponds to the query terms, we took the left and right contexts of the query term, measured in characters. We will refer to these segment approximations as text spans throughout the rest of this paper. We have experimented with longer text spans with a context length of 300 characters and shorter spans with a length of 150 characters. The longer spans turned out to be more informative for further analysis, but we have used the shorter spans too for the sentiment evaluation. We fixed the number of characters instead of the number of tokens due to the errors caused by Optical Character Recognition (OCR): many punctuation characters are misused and make the segmentation of text content into both sentences and tokens more difficult. The final repository contains 3,751 text spans in total. Figure 3 shows the distribution of the spans with respect to both the publication year 6 and the source. 7

Figure 3: The distribution of spans with respect to the publication year and the source.

Text Preprocessing and Quality Analysis

The first text preprocessing steps were related to the replacement of HTML entities with appropriate characters (for example, &quot; was replaced with “) or their deletion (for example, the entities for @ and ®). We also deleted those characters that do not belong to the extended ASCII coding scheme, which were primarily related to document layout (for example, various bullet points such as □) or Church Slavonic texts (for example, the letter Ю). As the digitization spans an extended period, covering sources with diverse layouts, typography, and design, the number of OCR errors noticed in the created repository is not negligible.
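The text-span approximation introduced at the start of this passage can be sketched as below. The function name `text_spans` is hypothetical, and the sketch assumes that the stated context length applies to each side of the query term; the text does not say whether 300 and 150 characters are per side or in total.

```python
def text_spans(text: str, query: str, context: int = 300) -> list[str]:
    """Return one span per occurrence of `query` in `text`.

    Each span is the query term plus up to `context` characters on its
    left and on its right (assumption: the length is counted per side).
    Spans are clipped at the start and end of the page text.
    """
    spans = []
    start = text.find(query)
    while start != -1:
        left = max(0, start - context)
        right = min(len(text), start + len(query) + context)
        spans.append(text[left:right])
        start = text.find(query, start + 1)  # look for the next occurrence
    return spans


page = "Na koncertu je hor otpevao pesmu, a zatim je solista otpevao ariju."
for span in text_spans(page, "otpevao", context=10):
    print(repr(span))
```

Counting characters rather than tokens, as the authors do, avoids depending on sentence splitting or tokenization, both of which are unreliable on noisy OCR output.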
Similarly to other projects of the same kind, 8 we have seen OCR errors related to:
1) Over-segmentation, that is, the splitting of valid words, such as “klav ira” instead of “klavira” (piano);
2) Under-segmentation, that is, the omission of delimiters, such as “otpevaopesmu” instead of “otpevao pesmu” (sang the song);
3) Misrecognized characters, for example, “hi.mnu” instead of “himnu” (anthem);
4) A missing character, such as “intelekualna elia” instead of “intelektualna elita” (intellectual elite);
5) Hallucination, such as “otpevao jednu sv? I.mprovizašču” instead of “otpevao jednu svoju improvizaciju” (meaning [he/she] sang one of his/her improvisations).

Many OCR errors were connected to the misuse of punctuation marks. The relative percentage of the punctuation marks with respect to the total number of alphanumerical characters, shown in Figure 4, confirms this statement (the total percentage of punctuation marks is 4.64 %). Due to the importance of punctuation for the NLP pipeline, particularly for the sentence splitting and tokenization stages, in the preprocessing stage we kept only the elementary punctuation marks such as sentence delimiters (.!?) and commas.

6 For the year 1935, the number of digitized newspapers is lower than for the remaining years.
7 Others refers to those sources with fewer than ten spans per source. The total number of sources that satisfy this criterion is 44. The aggregation was done to provide a more accessible preview.
8 Elizabeth Soper, Stanley Fujimoto, and Yen-Yun Yu, “BART for Post-Correction of OCR Newspaper Text,” in Proceedings of the Seventh Workshop on Noisy User-Generated Text (W-NUT 2021), (Association for Computational Linguistics, 2021), 284–290, https://aclanthology.org/2021.wnut-1.31/ , DOI: 10.18653/v1/2021.wnut-1.31 .
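The punctuation measurements and the reduction to elementary punctuation marks described above can be sketched as follows. This is an illustrative approximation, not the authors' pipeline: the function names are hypothetical, and only ASCII punctuation (Python's `string.punctuation`) is considered.

```python
import html
import string

KEPT_MARKS = set(".!?,")  # sentence delimiters and commas, as in the text


def punctuation_share(text: str) -> float:
    """Punctuation marks as a percentage of alphanumeric characters."""
    alnum = sum(ch.isalnum() for ch in text)
    punct = sum(ch in string.punctuation for ch in text)
    return 100.0 * punct / alnum if alnum else 0.0


def clean_span(text: str) -> str:
    """Decode HTML entities, then drop every ASCII punctuation mark
    except the elementary ones listed in KEPT_MARKS."""
    text = html.unescape(text)  # e.g. "&quot;" becomes a quotation mark
    return "".join(
        ch for ch in text
        if ch not in string.punctuation or ch in KEPT_MARKS
    )


raw = "otpevao &quot;pesmu&quot;; divno!"
print(clean_span(raw))
print(round(punctuation_share(clean_span(raw)), 2))
```

A per-mark breakdown like the one in Figure 4 could be obtained by counting each character of `string.punctuation` separately with `collections.Counter`.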
Figure 4: Relative percentage of punctuation marks.

In addition to varying publishing practices, the corpus of critical writing on music reflects diverse linguistic properties, as both Cyrillic and Latin alphabets are used for the transcription of Serbo-Croatian, as well as its distinct Ekavian and Ijekavian dialects. Therefore, the preprocessing stage required the transliteration of the collected spans into a common alphabet; for simplicity and alignment with other tools, we decided to use the Latin alphabet. 9

9 For the transliteration we used the bi-directional transliterator for Python available at https://github.com/barseghyanartur/transliterate .

To assess the quality of the text content, we analyzed the sets of short and long words. The experiment started with words of a length of less than four characters. This list was quite long, with a total of 85,165 words. However, the number of unique words was significantly lower, in total only 2,674. We manually checked the most frequent short words and handcrafted the rules for the correction of OCR errors wherever possible. We noticed many common prefixes and suffixes probably caused by over-segmentation due to newspaper layouts and column structures. In addition to short words, we analyzed the long words with a length of greater than 15 characters. This list contained 516 unique words. The majority of these words were related to hallucination, for example, eieiemei6ieteiemeiem9i6teieiei9iei6k, and helped us to identify the low-quality text spans and exclude them from further analysis. The rest of the words were primarily correct compound words.

Exploratory Analysis of the Corpus

In order to better understand the thematic structure of the extracted spans, we used k-means clustering.
10 The k-means algorithm is traditionally used in exploratory data analysis to cluster the elements by a given similarity criterion into k distinct groups. We ran three experiments characterized by different span representations: n-grams of varying length, Tf-Idf, and Tf-Idf with lemmatized tokens. In all the experiments, we used the Euclidean distance as the metric and the elbow method to choose the number of clusters.

In the n-gram-based experiments, we used n-grams of varying lengths, ranging from two to four. This approach is appropriate as it is not based on a specific vocabulary and can adapt to various OCR errors. However, the results obtained in these experiments could not be interpreted easily. Still, the investigation was very informative, as we singled out frequent n-grams that reflect OCR errors and that can be used for text quality improvement in the future.

Using Tf-Idf representations without prior lemmatization, we obtained a vocabulary of 9,330 tokens. All the tokens were lowercased and all punctuation marks were eliminated, as were those tokens with fewer than three occurrences or appearing in more than 80 % of spans. Analyzing the tokens with fewer than three occurrences, we noticed additional OCR errors and morphological forms that refer to the same lemma. As the number of such tokens is quite large – close to 50,000 – once again we confirmed the importance of OCR improvement and the use of lemmatization. The tokens appearing in more than 80 % of spans were also manually analyzed and used for the extension of the final stopwords list. For example, indefinite pronouns such as “svaki,” “neki,” “ovako” and “ovo” were extracted from this list.

Following the same setup, we used Tf-Idf representations with lemmatization. 11 The resulting vocabulary contained a total of 5,993 tokens. The lists of stopwords and out-of-vocabulary words were still quite large, with close to 35,000 words in total.
We also partially manually examined the content of these lists and incorporated the conclusions into the extension of the final stopwords list and more fine-grained text cleaning practices.

For cluster visualization, we used word clouds. 12 Observing the visualization, we concluded the following:
1. One of the queries used for the repository creation (the query “izvedba”) was used in the selected time span to describe construction works and fine handcrafts. We noticed this through qualifiers such as “jeftin” (meaning cheap), “cena” (meaning price), “kvalitet” (meaning quality), and “garancija” (meaning warranty), which are not usually found in the context of music. Therefore, we marked these text spans as irrelevant and excluded them from further analysis.
2. There are repetitive spans in the repository due to advertisements published by various sources. This behaviour is reflected in the word clouds by a larger font size, that is, by the higher frequency of the words in the advertisements. We kept only the first span occurrence and eliminated the duplicates.
3. The group of spans included words in Slovenian, such as “tekmovalec” (meaning competitor) and “vaje” (meaning exercises), which appeared in the word cloud. As we intended to work only with material in Serbian, we put these spans aside for future research.

10 We used Python library scikit-learn available at https://scikit-learn.org/stable/ .
11 For lemmatization and part-of-speech tagging we used Classla Python package available at https://pypi.org/project/classla/ .
12 We used Python library WordCloud available at https://amueller.github.io/word_cloud/index.html .

Depending on the text span representations, the number of clusters ranged from nine to twelve.
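The Tf-Idf clustering setup described in this section can be approximated with scikit-learn, the library the authors cite. The sketch below is an assumption-laden illustration rather than their code: the function name `inertia_by_k` is hypothetical, `min_df=3` approximates “fewer than three occurrences” by document frequency, and the toy spans are invented for demonstration only.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def inertia_by_k(spans, k_values, min_df=3, max_df=0.8, stop_words=None):
    """Fit k-means on Tf-Idf vectors for several k and return the
    fitted vectorizer plus {k: inertia} for an elbow plot."""
    vectorizer = TfidfVectorizer(
        lowercase=True,         # tokens are lowercased
        min_df=min_df,          # drop tokens found in too few spans
        max_df=max_df,          # drop tokens found in too many spans
        stop_words=stop_words,  # optional custom stopword list
    )
    matrix = vectorizer.fit_transform(spans)
    inertias = {}
    for k in k_values:
        # KMeans minimizes squared Euclidean distance to the centroids.
        model = KMeans(n_clusters=k, n_init=10, random_state=0)
        model.fit(matrix)
        inertias[k] = model.inertia_
    return vectorizer, inertias


# Invented toy spans; thresholds are relaxed because the sample is tiny.
toy_spans = [
    "arija tenor glas", "arija tenor program",
    "cena kvalitet garancija", "cena kvalitet jeftin",
    "hor crkva liturgija", "hor crkva pojanje",
]
vec, scores = inertia_by_k(toy_spans, [2, 3], min_df=1, max_df=1.0)
for k, inertia in scores.items():
    print(k, round(inertia, 3))
```

The elbow method then amounts to plotting inertia against k and picking the point where the curve flattens; for the lemmatized variant, the spans would be replaced by their lemma sequences before vectorization.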
Some of the clusters could easily be named: Jewish culture, music education, religious ceremonies, choir performances, and so on. Figure 5 shows the word cloud (in Serbian) devoted to opera performances, with words expressing the reception of music such as “izvrsno” (excellent), “uspeh” (success), “umetnički” (artistically) and “prijatan” (agreeable), as well as the more mature language related to the technical description of the performances, such as “arija” (aria), “tenor” (tenor), “program” (programme), “tačka” (a piece in a concert programme), “glas” (voice) and various others.

Figure 5: The word cloud related to opera performances.

Multidisciplinary Considerations

Changes in the Discourse between 1878 and 1941

In this section, historical changes to the discourse in music criticism in Serbian between 1878 and 1941 are analyzed, based on the data gathered through the process described in the previous sections of the study. The purpose is to consider whether the canon that is shaped through traditional musicological research is reinforced or contradicted by the quantitative analysis of the selected digital corpora.

Regarding the canon, we rely on the work of Roksanda Pejović, the most prolific researcher of Serbian music criticism from the historiographic perspective, 13 who identified certain fundamental changes that occurred throughout the observed historical period, between 1878 and 1941. As a result, she asserts that there are notable differences between the approaches of the long nineteenth century and the interwar period, respectively.
14 Drawing on the wide selection of material at her disposal, and the methodology of the “close reading” of journals and newspapers, which is typical of musicological research, 15 Pejović pointed out that “Serbian music criticism was in its infancy in the nineteenth century.” 16 In her opinion, the critics of this period were mostly journalists with only modest musical knowledge, and their approach to music relied on theatrical criticism; otherwise, many of them simply repeated the words of authoritative commentators on music. In both cases, the emphasis was on the cultural event itself, while expert assessment was either missing or disregarded. Musical amateurs expressed national ideas and coloured their language with romantic nuances, while also taking care to enlighten and instruct their readers.

13 Roksanda Pejović published twelve books and many articles on musical performance and criticism in Serbian covering the period between the nineteenth and early twenty-first centuries. Some of them are: Roksanda Pejović, Srpsko muzičko izvođaštvo romantičarskog doba (Beograd: Univerzitet umetnosti, 1991); Roksanda Pejović, Kritike, članci i posebne publikacije u srpskoj muzičkoj prošlosti (1825–1918), (Beograd: Fakultet muzičke umetnosti, 1994); Roksanda Pejović, Opera i balet Narodnog pozorišta u Beogradu (1882–1941), (Beograd: [s. n.], 1996); Roksanda Pejović, Muzička kritika i esejistika u Beogradu (1919–1941), (Beograd: Fakultet muzičke umetnosti, 1999); Roksanda Pejović, Koncertni život u Beogradu (Beograd: Fakultet muzičke umetnosti, 2004).
14 The borders between these two periods are permeable: certain elements of the new professional approach may be noted before 1914, and vice versa – certain relics are present in the interwar period.

On the other hand, after the First World War, important changes were introduced through the professionalization of music critics, and Yugoslav musical
criticism came much closer to the typical style found in European journals and newspapers. 17 As Aleksandar Vasić noted in his study of Serbian music criticism in the first half of the twentieth century, which is related to the Serbian Literary Magazine (Srpski književni glasnik, 1901–1914, 1920–1941), the most important goal of the writers was to educate readers and to include an analytical perspective in their texts. 18 The change in the profile of the critics is also important: instead of the journalists and mostly musical amateurs found during the previous period, in the interwar period music criticism was in the hands of professional musicians as well as educated intellectuals: literary critics, poets and writers, historians, and even lawyers. 19

15 We refer here to the concept of “distant reading” created by Franco Moretti, based on quantitative and computational methods, and used to study British novels of the eighteenth and nineteenth centuries. He introduced “distant reading” in opposition to the practice of “close reading,” which implies the careful analysis of the text, where attention is paid to critical interpretation and significant details. Cf. Franco Moretti, Distant Reading (London; New York: Verso, 2013).
16 Pejović, Kritike, članci i posebne publikacije, 2.

Table 1 offers a summary of some of the basic differences between the two periods in question.
Table 1: A comparative overview

| | The Long Nineteenth Century | The Interwar Period |
| Critics | Amateurs, journalists specialized in theatre criticism | Professionals and educated intellectuals (writers, literary critics and historians) |
| The focus of critical writing | The cultural event | The musical piece and performance |
| Approach to the musical event | Descriptive | More analytical, with moderate and appropriate use of musical-technical attitudes |
| The typical language | Showing “romantic turns” | A literary style |

Inevitably, a lot of questions arise regarding the computational exploration of the canon. Do digital corpora confirm changes in the type of description between the two periods, from a mainly descriptive, generally nontechnical style to a more music-specific and musically competent style of writing, or do they destabilize previous findings? What kind of aid can exploiting the resources of a digital newspaper archive offer to qualitative research that has already generated certain conclusions? At this point in our research, bearing in mind all the methodological concerns and limitations imposed by the text quality, the main contribution which moves beyond the traditional approach is the ease with which digital corpora allow the searching of content and the increased breadth of the access available.

17 Pejović, Muzička kritika, 9.
18 Aleksandar Vasić, “Serbian Music Criticism in the First Half of the Twentieth Century: Its Canon, Its Method and Its Educational Role,” Muzikologija, no. 8 (2008): 202.
19 Pejović, Muzička kritika, 8.

To illustrate this point, we referred to the content analysis of the assembled digital corpus and conducted a search for the term harmony, including adjectives and adverbs related to this word.
Critical comments related to this particular term were chosen because competence in musical harmony is among the parameters that demarcate the two analyzed periods. As mentioned, music writers in the long nineteenth century had modest music-theoretical knowledge, and it is expected that their comments related to harmony are mostly of a general nature. By contrast, it can be presumed that representatives of the interwar period demonstrated more sophisticated knowledge of musical harmony in their critical writings.

The term harmony appears seven times in the analyzed material: twice in the first, and five times in the second phase. In terms of the Serbian historical language, several variants of the term were used: not just the adjective “harmonski,” but also “harmonijski” and even “harmoničeski,” with all their derived forms. However, the word can have multiple meanings that are not differentiated by the NLP methods and tools: (1) it can pertain to choral singing, mostly related to the musical practice of the Orthodox Church, 20 (2) it can denote performance qualities, in terms of a united and well-rehearsed ensemble (in the analyzed cases this meant choirs) and (3) it can denote the musical “vertical,” in accordance with the common understanding of the term. Examples of the first meaning are not present in the digital resources covering the selected time span, so we explore the second and the third use of the word harmonija.

The following citation comes from a critical comment on the participation of the student choir in the divine service in the Church of the Ascension (Vaznesenjska crkva) in Belgrade in 1890. According to the anonymous critic, the music for the Liturgy of St. John Chrysostom composed by Kornelije Stanković was sung with great precision, in a church building full of faithful citizens.
We are pleased to emphasize that only this liturgical music based on folk tunes can arouse a feeling of piety among Serbs. Krestu tvojemu poklanjajemsja vladiko [At the Most Holy Cross of our Savior] was sung so beautifully in the traditional spirit and in full, appropriate harmony (the bass was especially prominent), that one could not stop listening to it. 21

Obviously, the romantically inclined critic was attracted by the coordination of the voices and the musical skills of the group of performers; the syntagm “appropriate (‘prikladna’) harmony,” particularly bearing in mind the comment on the bass part, is not related to the “vertical” musical component, but to the expressive and unified sound of the choral ensemble.

20 This is mostly related to the terms “harmoničeski” and “harmonijski”. Cf. Ivana Perković Radak, Od anđeoskog pojanja do horske umetnosti: Srpska horska crkvena muzika u periodu romantizma (do 1914. godine) (Beograd: Fakultet muzičke umetnosti, 2008), 5–8.
21 “Sa zadovoljstvom naglašujemo, da samo ova narodna služba može u Srbinu pobuditi osećaj pobožnosti. – ‘Krestu tvojemu poklanjajemsja vladiko’ je tako divno otpevano u narodnom duhu i punoj, prikladnoj harmoniji (osobito se basovi pokazaše), da se dovoljno naslušati ne mogosmo.” Anon., “Beogradske vesti,” Male novine (March 15, 1890): 3, http://istorijskenovine.unilib.rs/view/index.html#panel:pp|issue:UB_00031_18900315|page:3|query:otpevano .

The second example is the critical review of the Academic Choral Society Obilić performance in February 1927.
The music critic, Petar Krstić, offered the following comment:

Njest svjat [No One is as Holy] by Mokranjac and Svjati Bože [Holy God] by Hristić were performed, which sounded a bit thin in the “Obilić” choir, then Romarska popevka by Štolcer for a female choir and tenor, done homophonically, with interesting harmony and dynamics […] Štolcer’s Ej sastali se čapljinski Tatari [Hey, the Tatars of Čapljina Met] (a folk song from Southern Serbia) was kept too much in unison, so that the composer did not offer anything of his own, except for the tones held in certain parts. Both the piano accompaniment and the harmonic arrangement are too primitive. 22

As previously mentioned, during the first phase the word harmony was used less often than later, and it denoted the compatibility of the members of the musical ensemble (“appropriate harmony”) or general musical euphony. In the interwar period, the word “harmony” and its derived forms were used more often, and the meaning mostly corresponds to the contemporary sense of the word. As expected, and as shown in Krstić’s quotation, writers comment on harmony in relation to musical pieces much more often than in terms of performance. We presume that a similar search for other specialized musical terms, like “harmonija,” would yield similar results.

In addition to the phenomenon of the word harmony, we explored the qualifiers, the distinct tokens, in the text spans for both periods and compared their frequencies. In our corpus, the long nineteenth century is less well covered, with a total of 290 text spans. The interwar period is better represented, with 1,698 text spans. We used a part-of-speech tagger in order to single out the list of nouns and adjectives/adverbs related to the query term as appropriate. The numerous OCR errors propagated through the pipeline, resulting, again, in approximate tags.
Despite this fact, we managed to single out the qualifiers “besprekorno” (flawlessly), “solistički” (soloist), “filharmonija” (philharmonic), “tehnika” (technique), “homofono” (homophonic), “zvučno” (sonant) and “ton” (note of a certain pitch), as well as the voice types tenor, baritone, and bass, that uniquely characterize the interwar period.

22 “Od crkvenih i pobožnih pesama izvođene su ‘Njest svjat’ od Mokranjca, ‘Svjati Bože’ od Hristića, koje je u horu ‘Obilića’ zvučalo nešto mršavo, zatim ‘Romarska popevka’ od Štolcera za ženski hor i tenor, rađena homofono, harmonski i dinamički zanimljiva […] Štolcerova ‘Ej sastali se čapljinski Tatari’ (narodna pesma iz Južne Srbije) i suviše je unisono držana, tako, da kompozitor nije uneo ničeg svoga, do držanih tonova u pojedinim glasovima. I klavirska pratnja i harmonska obrada je suviše primitivna.” P. J. Kr., “Koncert Akademskog pevačkog društva ‘Obilić,’” Pravda (February 23, 1927): 5, http://istorijskenovine.unilib.rs/view/index.html#panel:pp|issue:UB_00042_19270223|page:5|query:otpevane .

Our findings confirm earlier conclusions that the nineteenth-century approach was more informative and educational in nature, while music critics from the time of the first Yugoslavia paid more attention to the technical and stylistic layers of musical pieces and performances. In other words, the corpus-wide analysis confirmed the musicological canon, or, as Estelle Joubert noticed in her study dedicated to the “distant reading” of French music criticism, “it seemed to reinforce it.” 23 However, her suggestion that future machine learning projects may offer further opportunities for quantitative research in music criticism is entirely reasonable and equally applicable to the findings of our research.
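The comparison of qualifiers between the two periods can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes that each text span has already been tagged and lemmatized into (lemma, tag) pairs with Universal POS labels, and the function names, the `min_count` threshold (meant to damp one-off OCR noise) and the toy data are all hypothetical. Because the two periods are covered very unevenly (290 against 1,698 text spans), the sketch also reports occurrences per span rather than raw counts.

```python
from collections import Counter

# Universal POS tags assumed for nouns, adjectives and adverbs.
QUALIFIER_TAGS = frozenset({"NOUN", "ADJ", "ADV"})


def qualifier_counts(tagged_spans, keep_tags=QUALIFIER_TAGS):
    """Count lemmas with a kept POS tag over a list of tagged text spans.

    Each span is a list of (lemma, tag) pairs produced by some tagger.
    """
    counts = Counter()
    for span in tagged_spans:
        for lemma, tag in span:
            if tag in keep_tags:
                counts[lemma.lower()] += 1
    return counts


def exclusive_qualifiers(target, other, min_count=2):
    """Lemmas seen at least `min_count` times in `target` and never in `other`."""
    return sorted(
        lemma for lemma, n in target.items()
        if n >= min_count and lemma not in other
    )


def per_span_rate(counts, n_spans):
    """Occurrences per text span, so periods of unequal size can be compared."""
    return {lemma: n / n_spans for lemma, n in counts.items()}


if __name__ == "__main__":
    # Invented toy data, not taken from the corpus.
    long_century = [
        [("koncert", "NOUN"), ("lep", "ADJ")],
        [("pesma", "NOUN"), ("lepo", "ADV")],
    ]
    interwar = [
        [("tehnika", "NOUN"), ("besprekorno", "ADV"), ("koncert", "NOUN")],
        [("tehnika", "NOUN"), ("ton", "NOUN"), ("i", "CCONJ")],
        [("ton", "NOUN"), ("besprekorno", "ADV")],
    ]
    lc = qualifier_counts(long_century)
    iw = qualifier_counts(interwar)
    rates = per_span_rate(iw, len(interwar))
    for lemma in exclusive_qualifiers(iw, lc):
        print(lemma, round(rates[lemma], 2))
```

With noisy OCR input, a list obtained this way is only a set of candidates; each lemma would still need to be checked against the page images before it is read as characteristic of a period.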
The Emotional Impact

Unfortunately, it was not possible to apply NLP methods to the sentiment analysis of music-critical writings in Serbian, because of the many errors caused by the OCR tool, as already explained, and the domain shift of the existing models for sentiment analysis. It would be interesting to analyze computationally the emotions expressed in the text spans and the accompanying vocabulary. The exploration of emotional keywords and adjectives used by Serbian critics in their writing on music would be beneficial in a number of ways, 24 but, regrettably, the current state of the created corpora does not provide sufficient opportunities for this type of research.

As Juslin clearly states, emotions are regarded as the key aspect of the musician’s or listener’s “experience, even the main motive for listening to music.” 25 Even if the use of the corpus for research in this field is limited in the current phase of our research, it is intriguing to offer a glance at some of the music-induced emotions described and commented upon by music critics who wrote in Serbian under the influence of the Romantic movement.

The collected data provided an interesting starting point for identifying evidence for one of the most fascinating themes in music research: music-induced tears/crying. As is well known, one of the most common strong experiences related to music is expressive behaviour and a physical reaction to music in the form of tears. 26 Tears, as Gabrielsson notes, are one of the most frequent physiological reactions, varying from a mild response (moist eyes) to intense crying. Moreover, it is interesting that tears can accompany both positive and negative emotions: positive because of the quality of the music or the performance, and negative when “the music was associated with a sad event.” 27 Our selection was shaped by several factors: the particular topic is underexplored (the most widely researched emotions are happiness, sadness, fear, valence, and arousal) 28 not only in the framework of national musicological research, but also from the historical perspective. It is also provocative from the contextual point of view, bearing in mind the complex historical, social and cultural situation in the selected time span.

23 Estelle Joubert, “‘Distant Reading’ in French Music Criticism,” Nineteenth-Century Music Review 19, no. 2 (2021): 291–315, DOI: 10.1017/S1479409820000476.
24 Cf. Yao Liang and Hu Wang, “Sentiment Analysis of Music Criticism Based on Data Mining,” Advances in Social Science, Education and Humanities Research (Atlantis Press, 2018): 368–371, DOI: 10.2991/iceemr-18.2018.84.
25 Patrik N. Juslin, Musical Emotions Explained: Unlocking the Secrets of Musical Affect (New York: Oxford University Press, 2019), 8.
26 Alf Gabrielsson, “Emotions in Strong Experiences with Music,” in Music and Emotion: Theory and Research, Series in Affective Science, edited by Patrik N. Juslin and John A. Sloboda (New York: Oxford University Press, 2001), 549, DOI: /acprof:oso/9780199230143.001.0001 .
Tears, as Sloboda points out, “may relate to emotions provoked by endings (whether of loss or the relief), and the precipitating musical structures may be those which encourage the listener to anticipate an impending ending or release of tension.” 29 However, the nine examples found in our repository do not relate to musical structure at all, but rather to extra-musical reasons: religious belonging (three examples), the political importance of certain people (two examples), singing with a “gusle” accompaniment (two examples), national pride (one example) and institutional belonging (one example). A broadening of the research corpus might, however, bring other results that would support Sloboda’s thesis.

As an example, a critical comment on the concert given by Stevan Deskašev in Skoplje in 1896 illustrates well the idea of music criticism as an agent of Romantic nationalism:

Mr. Deskašev opened the concert with the Ottoman anthem, which was interrupted by loud shouts of Padishah Chok Yasha! [Long live the Emperor!]. The crown of our artist’s concert was when he sang Serbian folk songs at the end. His patriotic act was very important for the Serbian cause. Especially when he sang Gusle moje, the Serbs in the audience cried. 30

The Romantic lyrics written by the famous Serbian poet Branko Radičević in his poem dedicated to the national instrument, the “gusle” (the single-stringed bowed instrument used for the accompaniment of Serbian epic poetry), set to Ivan Zajc’s music, caused a strong emotional response: the lyrics and music induced tears in the Serbian audience.

27 Alf Gabrielsson and Siv Lindström Wik, “Strong Experiences Related to Music: A Descriptive System,” Musicae Scientiae 7, no. 2 (2003): 170, DOI: 10.1177/102986490300700201.
28 Kazuma Mori and Makoto Iwanaga, “Two Types of Peak Emotional Responses to Music: The Psychophysiology of Chills and Tears,” Scientific Reports 7, no. 1 (2017): 460–463, DOI: 10.1038/srep46063.
29 John A. Sloboda, “Music Structure and Emotional Response: Some Empirical Findings,” Psychology of Music 19, no. 2 (1991): 120, DOI: 10.1177/0305735691192002.
30 “G. Deskašev je otvorio koncerat otomanskom himnom, koja je sa gromkim ‘Čok-jaša Padišah!’ usklikom prekidana. Kruna koncerta našeg umetnika beše, što je na svršetku otpevao srpske narodne pesme. Ovim njegovim patriotskim činom mnogo je koristio srpskoj stvari. Naročito kod stavke ‘Gusle moje’ Srbi su plakali.” Anon., “Koncert St. Deskaševa u Skoplju,” Pozorište (October 3, 1896): 150, http://istorijskenovine.unilib.rs/view/index.html#panel:pp|issue:UB_00128_18960310|page:2|query:otpevao .

Another example, from 1938, relates to institutional belonging:

Former members of “Obilić,” many of them already gray-haired, mingled with current members, male and female students, and sang the old national song Hej, trubaču! (Hey, trumpet player!). This song, the anthem of the choral society “Obilić,” which symbolically connected the old and new generations of “Obilić’s people,” brought tears to the eyes of many. 31

In this case, the sense of belonging to the Obilić Academic Choral Society, founded in 1884, was reinforced by the performance of a piece by one of the society’s distinguished conductors, Josif Marinković, the composer of the patriotic song Hej, trubaču! The strong relationships among the members and the continuity of the institution, paired with the performance of Marinković’s piece, resulted in music-elicited tears.

Enrichment of Factual Knowledge

The NLP analysis of digitized newspapers uncovered certain facts and trends that had not previously been known or observed in musicological studies. Moreover, gathering new evidence of musical practice in Serbian society in the selected historical time span is one of the most important contributions of this type of research.
It is precisely this that expands our knowledge, not just at the level of music-historical facts, but as a basis either for drawing new conclusions in particular areas of research or for rethinking various canonical topics. A few examples will illustrate our point: two of them are based on a “close” and one on a “distant” reading of the digitized repository.

In the description of the Belgrade “slava” 32 (patron feast) celebrated on 10 June 1926, an Orthodox procession (“litije”) was organized after the solemn church service:

At around eleven o’clock, a procession was formed to pass through the town. The Patriarch walked under the canopy, and the clergy in front of him. At the end of the procession was a military band that played church hymns throughout. 33

31 “Bivši članovi ‘Obilića’, mnogi među njima već sedi, pomešali su se sa sadašnjim članovima, studentima i studentkinjama, i otpevali staru nacionalnu pesmu ‘Hej, trubaču!’. Ova pesma, himna pevačkog društva ‘Obilić’, koja je simbolično povezala stare i nove generacije ‘Obilićevaca’ izazvala je mnogima suze na oči.” Anon., “Slave beogradskih društava,” Beogradske opštinske novine (1938): 9, http://istorijskenovine.unilib.rs/view/index.html#panel:pp|issue:UB_00005_19381101|page:65|query:otpevali .
32 Krsna slava is a particular Serbian religious celebration (a Serbian family’s patron saint day). Many families have their slava, in memory of the baptism day of their predecessors, but there are also slavas of institutions and settlements, whether cities or villages.
33 “Oko jedanaest časova formirana je litija za prolazak kroz varoš. Pod nebom je išao Patrijarh, a ispred njega sveštentstvo. Na kraju povorke išla je vojna muzika koja je za celo vreme svirala crkvene arije.” Anon., “Litija kroz Beograd,” Vreme (June 11, 1926), http://istorijskenovine.unilib.rs/view/index.html#panel:pp|issue:UB_00043_19260611|page:5|query:%D0%BB%D0%B8%D1%82%D0%B8%D1%98%D0%B0%20%D0%BA%D1%80%D0%BE%D0%B7%20%D0%B1%D0%B5%D0%BE%D0%B3%D1%80%D0%B0%D0%B4%20%D0%B7 .

The information that a military orchestra played church hymns in the presence of the Patriarch and representatives of the senior Orthodox Church hierarchy attracts our attention. It is well known that Orthodox music is purely vocal, and that not a single musical instrument is used in church practice. Even if the town procession did not have such strict rules as the Orthodox rites, it is still surprising that an orchestra took part in it by playing ecclesiastical tunes. What kind of melodies were played by the military musicians? How did the Patriarch and priests respond to this? Was it regular practice in Belgrade in the 1920s or not? All these questions are worth further exploration, using either NLP research methods based on broader digital resources or the “close reading” approach.

A different, but also interesting finding is the caricature of the Belgrade quartet drawn by Pjer Križanić in 1931 and published in the Belgrade newspapers. 34 The members of the quartet were teaching staff at the Music School: the violinist Marija Mihailović, the viola player Jovan Zorko, the cello player Juro Tkalčić and the pianist Ćiril Ličar. They were one of the most prominent chamber ensembles in Serbian musical life during the eight years of their existence (1925–1933). 35 As far as we know, music historians have not previously identified this caricature, which testifies to the ensemble’s widespread popularity.

34 Zvonimir Kulundžić, “Karikaturista Pjer Križanić,” Beogradske Opštinske Novine 1 (1940): 89, http://istorijskenovine.unilib.rs/view/index.html#panel:pp|issue:UB_00005_19401001|article:pageDiv84|page:89|query:%D0%BF%D1%98%D0%B5%D1%80%20%D0%BA%D0%B2%D0%B0%D1%80%D1%82%D0%B5%D1%82%20%D1%82%D0%BA%D0%B0%D0%BB%D1%87%D0%B8%D1%9B .
35 Cf. Roksanda M. Pejović, “Kamerno muziciranje i njegovi zastupnici,” Novi zvuk – internacionalni časopis za muziku, no. 21 (2003): 95–103.

Another topic, where a “distant reading” approach would give valuable results, is the prominent presence of female performers, particularly choirs, in the musical life of the time. There are 58 references to female choirs in our dataset, mostly as performers or, much less frequently, as the intended ensemble of a piece. School female choirs are notably present, whether in celebrations of a school slava, where they took part in the church ritual, or on other occasions. Bearing in mind the common opinion that female performers and choirs were less visible because of the social norms of the time, the topic is worth further research. As an example, we give a short passage from a text devoted to a concert in honour of St. Sava, the first Serbian archbishop and the school patron, held at the Teacher Training School (Preparandija) in Sombor in 1889. The second piece on the programme was Robert Tolinger’s choral piece Ljubičice (Violets), composed to lyrics by Jovan Grčić Milenko. “This song was sung by a female choir, and it can be said with full authority that they are not lagging behind the male choir at all. Tolinger’s art is vividly reflected in this piece as well.” 36

In addition to the female choirs, the corpus contains evidence of solo performances in which female performers are referred to in various ways: as “umetnice” (artists), “pijanistkinje” (pianists), “pevačice” (singers), or just with the title “gospođica” (Miss) or “gospođa” (Madam). The graph with the relevant span distribution is shown in Figure 7.
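A span distribution of this kind could be tallied roughly as follows. This is a minimal sketch under stated assumptions, not the procedure behind Figure 7: it assumes the text spans are available as Latin-script strings, and the regular expressions are hypothetical, hand-written approximations of Serbian case endings rather than the output of a lemmatizer. The form “umetnici” is deliberately excluded, because it coincides with the masculine plural of “umetnik.”

```python
import re
from collections import Counter

# Hypothetical, inflection-tolerant patterns for the terms named in the text.
# "umetnici" is left out on purpose: it is also the plural of the masculine "umetnik".
FEMALE_TERMS = {
    "umetnica": r"\bumetnic(?:a|e|u|om|ama)\b",
    "pijanistkinja": r"\bpijanistkinj(?:a|e|i|u|om|ama)\b",
    "pevačica": r"\bpevačic(?:a|e|i|u|om|ama)\b",
    "gospođica": r"\bgospođic(?:a|e|i|u|om|ama)\b",
    "gospođa": r"\bgospođ(?:a|e|i|u|om|ama)\b",
}
COMPILED = {
    label: re.compile(pattern, re.IGNORECASE)
    for label, pattern in FEMALE_TERMS.items()
}


def count_spans_by_term(spans):
    """Count, for each term, the number of text spans that mention it at least once."""
    counts = Counter()
    for span in spans:
        for label, pattern in COMPILED.items():
            if pattern.search(span):
                counts[label] += 1
    return counts


if __name__ == "__main__":
    # Invented example sentences, not quotations from the corpus.
    sample_spans = [
        "Gospođica Mihailović svirala je sonatu.",
        "Pevačice ženskog hora i gospođa Petrović otpevale su dve pesme.",
        "Mlada pijanistkinja i umetnica izvela je program.",
        "Umetnici su nagrađeni aplauzom.",
    ]
    for label, n in count_spans_by_term(sample_spans).most_common():
        print(label, n)
```

Counting spans rather than individual occurrences keeps one long review from dominating the tally. On OCR-damaged text, however, exact patterns such as these will miss misrecognized forms, so the resulting figures should be read as lower bounds.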
The comparison with the male choir in terms of the quality of the performance is a guideline for future research on the musicological canon related to female ensembles in the long nineteenth century.

Figure 6: Križanić’s caricature of the Belgrade quartet.
Figure 7: Statistics of female performances.

36 “Ovu je pesmu ženski lik otpevao i s punim pravom kazati se može da ni malo nije ustupio muškima. I u ovom komadu živo se ogleda umetništvo Tolingerovo.” IVAN., “Svetosavska Beseda u Somboru,” Školski list 15 (1889): 31, http://istorijskenovine.unilib.rs/view/index.html#panel:pa|issue:UB_00015_18890215|article:div115|query:otpevao .

Conclusion

The digital corpus selected for this study, after the narrowing down of resources on critical writing in newspapers in Serbian, included 3,751 text spans in 3,442 pages available via the official web application of the Svetozar Marković University Library in Belgrade. The multidisciplinary collaboration, which aimed to connect the quantitative analysis of the digital resources with musicological research on music criticism written between 1878 and 1941, showed that there are many challenges, but also great opportunities, for both disciplinary areas. Even if the preparation of digital material was demanding and required a lot of effort and time, our results encourage and support further study. Some of the primary issues related to the creation of the corpus are the selection of adequate initial search queries, challenges in newspaper page segmentation, (too) many OCR errors and, consequently, out-of-vocabulary words that influenced the accuracy of the NLP tools used. Due to the corpus size, the intention was to perform as many tasks as possible automatically and in an unsupervised manner.
From the musicological point of view, narrowing the research terms was one of the challenges, as was finding the right balance between the broader concerns brought by distant reading methods and the intense examination of certain topics that came as a result of close reading. Furthermore, the abundance of available digital material was another concern, in terms of defining the most appropriate focus. However, the study achieved its main goal: it showed that there are multiple and creative ways to combine computational and musicological approaches to assess music criticism in Serbian, so as to identify new approaches to the topic.

The community interested in NLP and the Digital Humanities might find interesting the pipeline for the creation of sub-corpora on top of the rich and undiscovered resources provided by the Svetozar Marković University Library. Fortunately, there are initiatives 37 related to the correction of text after OCR, the restoration of diacritics, and switching between the different language variants for Serbian based on cascades of finite state transducers and e-dictionaries, as well as more machine-learning-guided approaches for other languages, 38 which we would like to investigate further in the future. In addition, sub-corpora related to music and music practices represent a vital starting point for the systematization and creation of the music-related language resources necessary for the analysis of both past and present music sources.

37 Cvetana Krstev and Ranka Stanković, “Old or New, We Repair, Adjust and Alter (Texts),” Infotheca 19, no. 2 (2019): 61–80, DOI: 10.18485/infotheca.2019.19.2.3.
38 Thi Tuyet Hai Nguyen et al., “Survey of Post-OCR Processing Approaches,” ACM Computing Surveys 54, no. 6 (2021): 1–37, DOI: 10.1145/3453476.
Musicological analysis of the selected corpus, directed towards discourse changes in the selected historical periods (before the First World War and in the interwar times), made it possible to reinforce the traditional findings that the nineteenth-century approach was more informative and educational in nature, while in the interwar period more attention was paid to stylistic and technical aspects, as well as the layers of musical performances and pieces. Furthermore, even if the current state of the digital material does not provide sufficient opportunities for research in broader “strokes,” the collected data provided an interesting prospect for one of the most fascinating themes in music research: music-induced tears/crying. However, it was explained that, in the available critical writing in Serbian, strong emotions were stimulated by the feeling of belonging (whether national or religious) more than by the music itself. Finally, experiments with NLP methods demonstrated exciting possibilities for gathering new evidence regarding musical practice in Serbian society between 1878 and 1941. Although further work is necessary to develop and refine the computational tools designed for music-related texts, particularly in the field of NLP, and to create new open access repositories of digitized newspapers, this study has shown that the benefits of digital endeavours in the field of musicology cannot be denied.

References

[Anon.] “Beogradske vesti.” Male novine (March 15, 1890).
[Anon.] “Koncert St. Deskaševa u Skoplju.” Pozorište (October 3, 1896).
[Anon.] “Litija kroz Beograd.” Vreme (June 11, 1926).
[Anon.] “Slave beogradskih društava.” Beogradske opštinske novine (November, 1938).
Gabrielsson, Alf. “Emotions in Strong Experiences with Music.” In Music and Emotion: Theory and Research, 431–449. Series in Affective Science. New York: Oxford University Press, 2001.
Gabrielsson, Alf, and Siv Lindström Wik. “Strong Experiences Related to Music: A Descriptive System.” Musicae Scientiae 7, no. 2 (2003): 157–217. DOI: 10.1177/102986490300700201.
IVAN. “Svetosavska beseda u Somboru.” Školski list 15 (1889).
Joubert, Estelle. “‘Distant Reading’ in French Music Criticism.” Nineteenth-Century Music Review 19, no. 2 (2021): 291–315. DOI: 10.1017/S1479409820000476.
Juslin, Patrik N. Musical Emotions Explained: Unlocking the Secrets of Musical Affect. New York: Oxford University Press, 2019.
Kr., P. J. “Koncert Akademskog Pevačkog Društva ‘Obilić.’” Pravda (February 23, 1927).
Krstev, Cvetana, and Ranka Stanković. “Old or New, We Repair, Adjust and Alter (Texts).” Infotheca 19, no. 2 (2019): 61–80. DOI: 10.18485/infotheca.2019.19.2.3.
Kulundžić, Zvonimir. “Karikaturista Pjer Križanić.” Beogradske Opštinske Novine 1 (1940).
Liang, Yao, and Hu Wang. “Sentiment Analysis of Music Criticism Based on Data Mining.” Advances in Social Science, Education and Humanities Research (2018): 368–371. DOI: 10.2991/iceemr-18.2018.84.
Ljubešić, Nikola, and Kaja Dobrovoljc. “What does Neural Bring? Analysing Improvements in Morphosyntactic Annotation and Lemmatisation of Slovenian, Croatian and Serbian.” In Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing. Association for Computational Linguistics, 2019: 29–34. DOI: 10.18653/v1/W19-3704.
Moretti, Franco. Distant Reading. London; New York: Verso, 2013.
Mori, Kazuma, and Makoto Iwanaga. “Two Types of Peak Emotional Responses to Music: The Psychophysiology of Chills and Tears.” Scientific Reports 7, no. 1 (2017): 460–463. DOI: 10.1038/srep46063.
Nguyen, Thi Tuyet Hai, Adam Jatowt, Mickael Coustaty, and Antoine Doucet. “Survey of Post-OCR Processing Approaches.” ACM Computing Surveys 54, no. 6 (2021): 1–37. DOI: 10.1145/3453476.
Pejović, Roksanda. “Kamerno muziciranje i njegovi zastupnici.” Novi zvuk – internacionalni časopis za muziku, no. 21 (2003): 95–109.
Pejović, Roksanda. Koncertni život u Beogradu. Beograd: Fakultet muzičke umetnosti, 2004.
Pejović, Roksanda. Kritike, članci i posebne publikacije u srpskoj muzičkoj prošlosti (1825–1918). Beograd: Fakultet muzičke umetnosti, 1994.
Pejović, Roksanda. Muzička kritika i esejistika u Beogradu (1919–1941). Beograd: Fakultet muzičke umetnosti, 1999.
Pejović, Roksanda. Opera i balet Narodnog pozorišta u Beogradu (1882–1941). Beograd: [s. n.], 1996.
Pejović, Roksanda. Pisana reč o muzici u Srbiji: Knjige i članci (1945–2003). Beograd: Fakultet muzičke umetnosti; Signature, 2005.
Pejović, Roksanda. Srpsko muzičko izvođaštvo romantičarskog doba. Beograd: Univerzitet umetnosti, 1991.
Perković Radak, Ivana. Od anđeoskog pojanja do horske umetnosti: Srpska horska crkvena muzika u periodu romantizma (do 1914. godine). Beograd: Fakultet muzičke umetnosti, 2008.
Sloboda, John A. “Music Structure and Emotional Response: Some Empirical Findings.” Psychology of Music 19, no. 2 (1991): 110–120. DOI: 10.1177/0305735691192002.
Soper, Elizabeth, Stanley Fujimoto, and Yen-Yen Yu. “BART for Post-Correction of OCR Newspaper Text.” In Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT 2021). Association for Computational Linguistics, 2021: 284–290. DOI: 10.18653/v1/2021.wnut-1.31.
Vasić, Aleksandar. “Serbian Music Criticism in the First Half of the Twentieth Century: Its Canon, Its Method and Its Educational Role.” Muzikologija, no. 8 (2008): 185–202.
Weliver, Phyllis, and Katharine Ellis, eds. Words and Notes in the Long Nineteenth Century. Woodbridge, UK; Rochester, NY: Boydell & Brewer, 2013.
SUMMARY

Computational Research of Music Criticism between 1878 and 1941 in Serbian: What Can We Learn from the Digital Archive of the Historical Newspapers in the Svetozar Marković University Library?

The process of digitization has affected many fields of musicological research, including the area of music criticism. However, computational analysis of the reception of musicians, musical works and their performers in Serbian has not been carried out so far, for various reasons. The key ones are: (1) the limited quantity of digitized resources and (2) the requirements of the natural language processing pipeline (NLP pipeline) with regard to the language of music and of historical sources.

The multidisciplinary research carried out for the purposes of this article, to our knowledge the first of its kind dealing with Serbian sources, began with the definition of the repository, which consisted mainly of critical writing that is now available in digital format. The definition itself was iterative and guided by methods of data analysis and practices of unsupervised machine learning. It also had to match the main goal of the study: to explore how to combine the opportunities offered by computational quantitative analysis with traditional qualitative methods of musicological research, and to indicate what this collaboration can contribute to the methods of music research, especially in terms of potential areas of study.

The created repository is a sub-corpus of the digitized newspapers in Serbian prepared by the Svetozar Marković University Library within the Europeana Newspapers project. The selection of relevant sources, the pre-processing and the analysis of the text were marked by numerous methodological challenges. Some were purely technical and connected with data collection and information retrieval. Others, more computational, were connected with assessing the quality of the written texts, improving optical character recognition (OCR), and the ability of natural language processing tools to follow musical terminology and chronological changes in the language.

On the basis of this repository of texts published over 63 years, mostly in newspapers, between 1878 (the date of the independence of Serbia) and the start of the Second World War in Yugoslavia (1941), we combined qualitative findings with the results of exploratory data analysis in order to examine selected topics relevant to research on music criticism.

For the purposes of this article, three main topics were selected as the most relevant in traditional musicological research on music criticism in Serbian: (1) changes in discourse in the selected historical period, (2) the language referring to the emotional impact of a particular performance or new musical composition, and (3) the enrichment of our factual knowledge of music history.

Possible directions for further research do not exclude other topics and areas that can be extracted from the huge collection of gathered material. On the contrary, the future opens up possibilities for broader and at the same time more refined insight, depending on the choice of research discourse.

ABOUT THE AUTHORS

IVANA PERKOVIĆ (ivanaperkovic@fmu.bg.ac.rs) is a musicologist and professor at the Department of Musicology of the Faculty of Music, University of Arts in Belgrade.
She is the author and co-author of five books (on Serbian religious music, the history of Serbian music, the Faculty of Music, and music and interdisciplinarity) and of over 60 articles and chapters in peer-reviewed journals and monographs. Her research fields include Orthodox, particularly Serbian, music; music references in literature; interdisciplinary approaches to music; topics of Self and Other in the context of European eighteenth-century music; digital musicology, etc. She is a member of the IMS, SMS and ISOCM, of the editorial board of the Matica Srpska Journal of Stage Arts and Music, and of the Commission for Acquiring Scientific Title of the Ministry of Education, Science and Technological Development (MESTD) of the Republic of Serbia, and she is Secretary General of the European Association of Conservatoires (AEC). She is experienced in creating and leading many national and international academic and research projects.

ANĐELKA ZEČEVIĆ (andjelkaz@mi.sanu.ac.rs) is a Ph.D. candidate in the domain of Natural Language Processing at the Faculty of Mathematics, University of Belgrade. She is currently employed as a researcher at the Mathematical Institute of the Serbian Academy of Sciences and Arts.